Disclosed is a family of synthetic proteins having binding affinity for a preselected antigen, and multifunctional
proteins having such affinity. The proteins are characterized by one or more sequences of amino acids
constituting a region which behaves as a biosynthetic antibody binding site (BABS). The sites comprise V H -V L or
V L -V H -like single chains wherein the V H and V L -like sequences are attached by a polypeptide linker, or individual
V H or V L -like domains. The binding domains comprise linked CDR and FR regions, which may be derived from
separate immunoglobulins. The proteins may also include other polypeptide sequences which function, e.g., as 
an enzyme, toxin, binding site, or site for attachment to an immobilization media or radioactive atom. Methods 
are disclosed for producing the proteins, for designing BABS having any specificity that can be elicited by in 
vivo generation of antibody, for producing analogs thereof, and for producing multifunctional synthetic proteins 
which are self-targeted by virtue of their binding site region.

The United States Government has rights in this application pursuant to small business innovation
research grant numbers SSS-4 R43 CA39870-01 and SSS-4 2 R44 CA39870-02. 

Reference to Related Applications 

This application is a continuation-in-part of copending U.S. application serial number 052,800 filed May 
21, 1987, the disclosure of which is incorporated herein by reference. 

Background of the Invention 

This invention relates to novel compositions of matter, hereinafter called targeted multifunctional 
proteins, useful, for example, in specific binding assays, affinity purification, biocatalysis, drug targeting,
imaging, immunological treatment of various oncogenic and infectious diseases, and in other contexts. More
specifically, this invention relates to biosynthetic proteins expressed from recombinant DNA as a single
polypeptide chain comprising plural regions, one of which has a structure similar to an antibody binding
site, and an affinity for a preselected antigenic determinant, and another of which has a separate function, 
and may be biologically active, designed to bind to ions, or designed to facilitate immobilization of the 
protein. This invention also relates to the binding proteins per se, and methods for their construction. 

There are five classes of human antibodies. Each has the same basic structure (see Figure 1), or
multiple thereof, consisting of two identical polypeptides called heavy (H) chains (molecular weight
approximately 50,000 d) and two identical light (L) chains (molecular weight approximately 25,000 d). Each
of the five antibody classes has a similar set of light chains and a distinct set of heavy chains. A light chain
is composed of one variable and one constant domain, while a heavy chain is composed of one variable
and three or more constant domains. The combined variable domains of a paired light and heavy chain are
known as the Fv region, or simply "Fv". The Fv determines the specificity of the immunoglobulin; the
constant regions have other functions.

Amino acid sequence data indicate that each variable domain comprises three hypervariable regions or 
loops, sometimes called complementarity determining regions or "CDRs" flanked by four relatively 
conserved framework regions or "FRs" (Kabat et al., Sequences of Proteins of Immunological Interest [U.S.
Department of Health and Human Services, third edition, 1983, fourth edition, 1987]). The hypervariable
regions have been assumed to be responsible for the binding specificity of individual antibodies and to 
account for the diversity of binding of antibodies as a protein class. 

Monoclonal antibodies have been used both as diagnostic and therapeutic agents. They are routinely 
produced according to established procedures by hybridomas generated by fusion of mouse lymphoid cells
with an appropriate mouse myeloma cell line.

The literature contains a host of references to the concept of targeting bioactive substances such as 
drugs, toxins, and enzymes to specific points in the body to destroy or locate malignant cells or to induce a 
localized drug or enzymatic effect. It has been proposed to achieve this effect by conjugating the bioactive 
substance to monoclonal antibodies (see, e.g., Vogel, Immunoconjugates. Antibody Conjugates in
Radioimaging and Therapy of Cancer, 1987, N.Y., Oxford University Press; and Ghose et al. (1978) J. Natl.
Cancer Inst. 61:657-676). However, non-human antibodies induce an immune response when injected into
humans. Human monoclonal antibodies may alleviate this problem, but they are difficult to produce by cell 
fusion techniques since, among other problems, human hybridomas are notably unstable, and removal of 
immunized spleen cells from humans is not feasible. 

Chimeric antibodies composed of human and non-human amino acid sequences potentially have
improved therapeutic value as they presumably would elicit less circulating human antibody against the 
non-human immunoglobulin sequences. Accordingly, hybrid antibody molecules have been proposed which 
consist of amino acid sequences from different mammalian sources. The chimeric antibodies designed thus 
far comprise variable regions from one mammalian source, and constant regions from human or another
mammalian source (Morrison et al. (1984) Proc. Natl. Acad. Sci. U.S.A., 81:6851-6855; Neuberger et al.
(1984) Nature 312:604-608; Sahagan et al. (1986) J. Immunol. 137:1066-1074; EPO application nos.
84302368.0, Genentech; 85102665.3, Research Development Corporation of Japan; 85305604.2, Stanford;
P.C.T. application no. PCT/GB85/00392, Celltech Limited).

It has been reported that binding function is localized to the variable domains of the antibody molecule
located at the amino terminal end of both the heavy and light chains. The variable regions remain
noncovalently associated (as V H V L dimers, termed Fv regions) even after proteolytic cleavage from the
native antibody molecule, and retain much of their antigen recognition and binding capabilities (see, for
example, Inbar et al., Proc. Natl. Acad. Sci. U.S.A. (1972) 69:2659-2662; Hochman et al. (1973) Biochem.
12:1130-1135; and (1976) Biochem. 15:2706-2710; Sharon and Givol (1976) Biochem. 15:1591-1594;
Rosenblatt and Haber (1978) Biochem. 17:3877-3882; Ehrlich et al. (1980) Biochem. 19:4091-4096).
Methods of manufacturing two-chain Fv substantially free of constant region using recombinant DNA 
techniques are disclosed in U.S. 4,642,334 and corresponding published specification EP 088,994. 

Summary of the Invention 

In one aspect the invention provides a single chain multifunctional biosynthetic protein expressed from 
a single gene derived by recombinant DNA techniques. The protein comprises a biosynthetic antibody
binding site (BABS) comprising at least one protein domain capable of binding to a preselected antigenic
determinant. The amino acid sequence of the domain is homologous to at least a portion of the sequence of 
a variable region of an immunoglobulin molecule capable of binding the preselected antigenic determinant. 
Peptide bonded to the binding site is a polypeptide consisting of an effector protein having a conformation 
suitable for biological activity in a mammal, an amino acid sequence capable of sequestering ions, or an
amino acid sequence capable of selective binding to a solid support.

In another aspect, the invention provides a biosynthetic binding site protein comprising a single polypeptide
chain defining two polypeptide domains connected by a polypeptide linker. The amino acid sequence
of each of the domains comprises a set of complementarity determining regions (CDRs) interposed
between a set of framework regions (FRs), each of which is respectively homologous with at least a portion
of the CDRs and FRs from an immunoglobulin molecule. At least one of the domains comprises a set of
CDR amino acid sequences and a set of FR amino acid sequences at least partly homologous to different 
immunoglobulins. The two polypeptide domains together define a hybrid synthetic binding site having 
specificity for a preselected antigen, determined by the selected CDRs. 

In still another aspect, the invention provides a biosynthetic binding protein comprising a single polypeptide
chain defining two domains connected by a polypeptide linker. The amino acid sequence of each of the
domains comprises a set of CDRs interposed between a set of FRs, each of which is respectively 
homologous with at least a portion of the CDRs and FRs from an immunoglobulin molecule. The linker 
comprises plural, peptide-bonded amino acids defining a polypeptide of a length sufficient to span the 
distance between the C terminal end of one of the domains and N terminal end of the other when the
binding protein assumes a conformation suitable for binding. The linker comprises hydrophilic amino acids
which together preferably constitute a hydrophilic sequence. Linkers which assume an unstructured 
polypeptide configuration in aqueous solution work well. The binding protein is capable of binding to a 
preselected antigenic site, determined by the collective tertiary structure of the sets of CDRs held in proper 
conformation by the sets of FRs. Preferably, the binding protein has a specificity at least substantially
identical to the binding specificity of the immunoglobulin molecule used as a template for the design of the
CDR regions. Such structures can have a binding affinity of at least 10⁶ M⁻¹, and preferably 10⁸ M⁻¹.

In preferred aspects, the FRs of the binding protein are homologous to at least a portion of the FRs
from a human immunoglobulin; the linker spans at least about 40 angstroms; a polypeptide spacer is
incorporated in the multifunctional protein between the binding site and the second polypeptide; and the
binding protein has an affinity for the preselected antigenic determinant no less than two orders of
magnitude less than the binding affinity of the immunoglobulin molecule used as a template for the CDR
regions of the binding protein. The preferred linkers and spacers are cysteine-free. The linker preferably
comprises amino acids having unreactive side groups, e.g., alanine and glycine. Linkers and spacers can be
made by combining plural consecutive copies of an amino acid sequence, e.g., (Gly4-Ser)3. The invention
also provides DNAs encoding these proteins and host cells harboring and capable of expressing these
DNAs. 
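
The affinity figures stated above can be illustrated with a short sketch. It assumes that the "two orders of magnitude" criterion means the binding protein's association constant is at most 100-fold below that of the template immunoglobulin; the function name and the example constants are hypothetical and are not measurements from this disclosure.

```python
import math

MIN_KA = 1e6        # stated minimum binding affinity, M^-1
PREFERRED_KA = 1e8  # stated preferred binding affinity, M^-1

def affinity_report(ka_babs: float, ka_template: float) -> dict:
    """Compare a binding protein's Ka with that of its template antibody.

    Assumption: "within two orders of magnitude" is read as a drop of
    at most a factor of 100 relative to the template.
    """
    orders_below = math.log10(ka_template / ka_babs)
    return {
        "meets_minimum": ka_babs >= MIN_KA,
        "meets_preferred": ka_babs >= PREFERRED_KA,
        "orders_below_template": orders_below,
        "within_two_orders": orders_below <= 2.0,
    }

# Hypothetical example values, chosen only to exercise the checks.
report = affinity_report(ka_babs=5e7, ka_template=1e9)
print(report)
```

With these illustrative inputs the binding protein clears the minimum but not the preferred threshold, and sits roughly 1.3 orders of magnitude below its template.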

As used herein, the phrase biosynthetic antibody binding site or BABS means synthetic proteins 
expressed from DNA derived by recombinant techniques. BABS comprise biosynthetically produced 
sequences of amino acids defining polypeptides designed to bind with a preselected antigenic material. The
structure of these synthetic polypeptides is unlike that of naturally occurring antibodies, fragments thereof,
e.g., Fv, or known synthetic polypeptides or "chimeric antibodies" in that the regions of the BABS 
responsible for specificity and affinity of binding, (analogous to native antibody variable regions) are linked 
by peptide bonds, expressed from a single DNA, and may themselves be chimeric, e.g., may comprise 
amino acid sequences homologous to portions of at least two different antibody molecules. The BABS
embodying the invention are biosynthetic in the sense that they are synthesized in a cellular host made to
express a synthetic DNA, that is, a recombinant DNA made by ligation of plural, chemically synthesized
oligonucleotides, or by ligation of fragments of DNA derived from the genome of a hybridoma, mature B cell
clone, or a cDNA library derived from such natural sources. The proteins of the invention are properly 

EP 0 623 679 A1 

characterized as "binding sites" in that these synthetic molecules are designed to have specific affinity for a 
preselected antigenic determinant. The polypeptides of the invention comprise structures patterned after 
regions of native antibodies known to be responsible for antigen recognition. 

Accordingly, it is an object of the invention to provide novel multifunctional proteins comprising one or 
more effector proteins and one or more biosynthetic antibody binding sites, and to provide DNA sequences
which encode the proteins. Another object is to provide a generalized method for producing biosynthetic 
antibody binding site polypeptides of any desired specificity. 

Brief Description of the Drawing 

The foregoing and other objects of this invention, the various features thereof, as well as the invention 
itself, may be more fully understood from the following description, when read together with the accom- 
panying drawings. 

Figure 1A is a schematic representation of an intact IgG antibody molecule containing two light chains,
each consisting of one variable and one constant domain, and two heavy chains, each consisting of one
variable and three constant domains. Figure 1B is a schematic drawing of the structure of Fv proteins (and
DNA encoding them) illustrating V H and V L domains, each of which comprises four framework (FR) regions 
and three complementarity determining (CDR) regions. Boundaries of CDRs are indicated, by way of 
example, for monoclonal 26-10, a well known and characterized murine monoclonal specific for digoxin. 

Figures 2A-2E are schematic representations of some of the classes of reagents constructed in
accordance with the invention, each of which comprises a biosynthetic antibody binding site. 

Figure 3 discloses five amino acid sequences (heavy chains) in single letter code lined up vertically to 
facilitate understanding of the invention. Sequence 1 is the known native sequence of V H from murine 
monoclonal glp-4 (anti-lysozyme). Sequence 2 is the known native sequence of V H from murine monoclonal
26-10 (anti-digoxin). Sequence 3 is a BABS comprising the FRs from 26-10 V H and the CDRs from glp-4
V H . The CDRs are identified in lower case letters; restriction sites in the DNA used to produce chimeric 
sequence 3 are also identified. Sequence 4 is the known native sequence of V H from human myeloma 
antibody NEWM. Sequence 5 is a BABS comprising the FRs from NEWM V H and the CDRs from glp-4 V H , 
i.e., illustrates a "humanized" binding site having a human framework but an affinity for lysozyme similar to
murine glp-4.

Figures 4A-4F are the synthetic nucleic acid sequences and encoded amino acid sequences of (4A) the 
heavy chain variable domain of murine anti-digoxin monoclonal 26-10; (4B) the light chain variable domain 
of murine anti-digoxin monoclonal 26-10; (4C) a heavy chain variable domain of a BABS comprising CDRs 
of glp-4 and FRs of 26-10; (4D) a light chain variable region of the same BABS; (4E) a heavy chain variable
region of a BABS comprising CDRs of glp-4 and FRs of NEWM; and (4F) a light chain variable region
comprising CDRs of glp-4 and FRs of NEWM. Delineated are FRs, CDRs, and restriction sites for 
endonuclease digestion, most of which were introduced during design of the DNA. 

Figure 5 is the nucleic acid and encoded amino acid sequence of a host DNA (V H ) designed to facilitate 
insertion of CDRs of choice. The DNA was designed to have unique 6-base sites directly flanking the CDRs
so that relatively small oligonucleotides defining portions of CDRs can be readily inserted, and to have other
sites to facilitate manipulation of the DNA to optimize binding properties in a given construct. The 
framework regions of the molecule correspond to murine FRs (Figure 4A). 

Figures 6A and 6B are multifunctional proteins (and DNA encoding them) comprising a single chain 
BABS with the specificity of murine monoclonal 26-10, linked through a spacer to the FB fragment of
protein A, here fused as a leader, and constituting a binding site for Fc. The spacer comprises the 11 C-
terminal amino acids of the FB followed by Asp-Pro (a dilute acid cleavage site). The single chain BABS
comprises sequences mimicking the V H and V L (6A) and the V L and V H (6B) of murine monoclonal 26-10.
The V L in construct 6A is altered at residue 4 where valine replaces methionine present in the parent 26-10
sequence. These constructs contain binding sites for both Fc and digoxin. Their structure may be
summarized as:

(6A) FB-Asp-Pro-V H -(Gly4-Ser)3-V L ,

and

(6B) FB-Asp-Pro-V L -(Gly4-Ser)3-V H ,

where (Gly4-Ser)3 is a polypeptide linker.

In Figures 4A-4E and 6A and 6B, the amino acid sequence of the expression products starts after the
GAATTC sequence, which codes for an EcoRI splice site, translated as Glu-Phe on the drawings.
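
The domain order of constructs 6A and 6B can be illustrated with a minimal Python sketch. Only the Asp-Pro cleavage site and the (Gly4-Ser)3 linker are taken from the text; the strings standing in for FB, V H and V L are hypothetical placeholders, not the sequences shown in Figures 6A and 6B, and the function name is likewise an assumption.

```python
# Sketch of the 6A/6B domain layout in single-letter amino acid code.
ASP_PRO = "DP"        # Asp-Pro, the dilute acid cleavage site
LINKER = "GGGGS" * 3  # (Gly4-Ser)3 polypeptide linker

def assemble(fb: str, first: str, second: str) -> str:
    """Return FB-Asp-Pro-<first domain>-(Gly4-Ser)3-<second domain>."""
    return fb + ASP_PRO + first + LINKER + second

# Hypothetical placeholders; the real sequences appear only in the figures.
FB, VH, VL = "<FB>", "<VH>", "<VL>"

construct_6a = assemble(FB, VH, VL)  # V H precedes V L
construct_6b = assemble(FB, VL, VH)  # V L precedes V H
print(construct_6a)
print(construct_6b)
```

The two constructs differ only in which variable domain lies on the amino terminal side of the linker.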

Figure 7A is a graph of percent of maximum counts bound of radioiodinated digoxin versus concentration
of binding protein adsorbed to the plate comparing the binding of native 26-10 (curve 1) and the
construct of Figure 6A and Figure 2B renatured using two different procedures (curves 2 and 3). Figure 7B
is a graph demonstrating the bifunctionality of the FB-(26-10) BABS adhered to microtiter plates through the
specific binding of the binding site to the digoxin-BSA coat on the plate. Figure 7B shows the percent
inhibition of 125I-rabbit-IgG binding to the FB domain of the FB BABS by the addition of IgG, protein A, FB,
murine IgG2a, and murine IgG1.

Figure 8 is a schematic representation of a model assembled DNA sequence encoding a multifunctional 
biosynthetic protein comprising a leader peptide (used to aid expression and thereafter cleaved), a binding 
site, a spacer, and an effector molecule attached as a trailer sequence. 

Figures 9A-9E are exemplary synthetic nucleic acid sequences and corresponding encoded amino acid
sequences of binding sites of different specificities: (A) FRs from NEWM and CDRs from 26-10 having the
digoxin specificity of murine monoclonal 26-10; (B) FRs from 26-10, and CDRs from G-loop-4 (glp-4) having
lysozyme specificity; (C) FRs and CDRs from MOPC-315 having dinitrophenol (DNP) specificity; (D) FRs
and CDRs from an anti-CEA monoclonal antibody; (E) FRs in both V H and V L and CDR1 and CDR3 in V H ,
and CDR1, CDR2, and CDR3 in V L from an anti-CEA monoclonal antibody; CDR2 in V H is a CDR2
consensus sequence found in most immunoglobulin V H regions.

Figure 10A is a schematic representation of the DNA and amino acid sequence of a leader peptide 
(MLE) protein with corresponding DNA sequence and some major restriction sites. Figure 10B shows the 
design of an expression plasmid used to express MLE-BABS (26-10). During construction of the gene, 
fusion partners were joined at the EcoRI site that is shown as part of the leader sequence. The pBR322
plasmid, opened at the unique SspI and PstI sites, was combined in a 3-part ligation with an SspI to EcoRI
fragment bearing the trp promoter and MLE leader and with an EcoRI to PstI fragment carrying the BABS
gene. The resulting expression vector confers tetracycline resistance on positive transformants. 

Figure 11 is an SDS-polyacrylamide gel (15%) of the (26-10) BABS at progressive stages of 
purification. Lane 0 shows low molecular weight standards; lane 1 is the MLE-BABS fusion protein; lane 2 is
an acid digest of this material; lane 3 is the pooled DE-52 chromatographed protein; lanes 4 and 5 are the
same ouabain-Sepharose pool of single chain BABS except that lane 4 protein is reduced and lane 5 protein
is unreduced. 

Figure 12 shows inhibition curves for 26-10 BABS and 26-10 Fab species, and indicates the relative 
affinities of the antibody fragment for the indicated cardiac glycosides. 

Figures 13A and 13B are plots of digoxin binding curves. (A) shows 26-10 BABS binding isotherm and
Sips plot (inset), and (B) shows 26-10 Fab binding isotherm and Sips plot (inset). 

Figure 14 is a nucleic acid sequence and corresponding amino acid sequence of a modified FB dimer 
leader sequence and various restriction sites. 

Figures 15A-15H are nucleic acid sequences and corresponding amino acid sequences of biosynthetic
multifunctional proteins including a single chain BABS and various biologically active protein trailers linked
via a spacer sequence. Also indicated are various endonuclease digestion sites. The trailing sequences are
(A) epidermal growth factor (EGF); (B) streptavidin; (C) tumor necrosis factor (TNF); (D) calmodulin; (E)
platelet derived growth factor-beta (PDGF-beta); (F) ricin; (G) interleukin-2; and (H) an FB-FB dimer.

Description

The invention will first be described in its broadest overall aspects with a more detailed description 
following. 

A class of novel biosynthetic, bi or multifunctional proteins has now been designed and engineered 
which comprise biosynthetic antibody binding sites, that is, "BABS" or biosynthetic polypeptides defining
structure capable of selective antigen recognition and preferential antigen binding, and one or more peptide- 
bonded additional protein or polypeptide regions designed to have a preselected property. Examples of the 
second region include amino acid sequences designed to sequester ions, which makes the protein suitable 
for use as an imaging agent, and sequences designed to facilitate immobilization of the protein for use in 
affinity chromatography and solid phase immunoassay. Another example of the second region is a bioactive
effector molecule, that is, a protein having a conformation suitable for biological activity, such as an 
enzyme, toxin, receptor, binding site, growth factor, cell differentiation factor, lymphokine, cytokine, 
hormone, or anti-metabolite. This invention features synthetic, multifunctional proteins comprising these 
regions peptide bonded to one or more biosynthetic antibody binding sites, synthetic, single chain proteins 
designed to bind preselected antigenic determinants with high affinity and specificity, constructs containing
multiple binding sites linked together to provide multipoint antigen binding and high net affinity and 
specificity, DNA encoding these proteins prepared by recombinant techniques, host cells harboring these 
DNAs, and methods for the production of these proteins and DNAs.

The invention requires recombinant production of single chain binding sites having affinity and
specificity for a predetermined antigenic determinant. This technology has been developed and is disclosed
herein. In view of this disclosure, persons skilled in recombinant DNA technology, protein design, and
protein chemistry can produce such sites which, when disposed in solution, have high binding constants (at
least 10⁶, preferably 10⁸ M⁻¹) and excellent specificity.

The design of the BABS is based on the observation that three subregions of the variable domain of 
each of the heavy and light chains of native immunoglobulin molecules collectively are responsible for 
antigen recognition and binding. Each of these subregions, called herein "complementarity determining
regions" or CDRs, consists of one of the hypervariable regions or loops and of selected amino acids or
amino acid sequences disposed in the framework regions or FRs which flank that particular hypervariable
region. It has now been discovered that FRs from diverse species are effective to maintain CDRs from
diverse other species in proper conformation so as to achieve true immunochemical binding properties in a
biosynthetic protein. It has also been discovered that biosynthetic domains mimicking the structure of the
two chains of an immunoglobulin binding site may be connected by a polypeptide linker while closely
approaching, retaining, and often improving their collective binding properties.

The binding site region of the multifunctional proteins comprises at least one, and preferably two
domains, each of which has an amino acid sequence homologous to portions of the CDRs of the variable 
domain of an immunoglobulin light or heavy chain, and other sequence homologous to the FRs of the 
variable domain of the same, or a second, different immunoglobulin light or heavy chain. The two domain
binding site construct also includes a polypeptide linking the domains. Polypeptides so constructed bind a
specific preselected antigen determined by the CDRs held in proper conformation by the FRs and the 
linker. Preferred structures have human FRs, i.e., mimic the amino acid sequence of at least a portion of the 
framework regions of a human immunoglobulin, and have linked domains which together comprise structure 
mimicking a V H -V L or V L -V H immunoglobulin two-chain binding site. CDR regions of a mammalian
immunoglobulin, such as those of mouse, rat, or human origin are preferred. In one preferred embodiment,
the biosynthetic antibody binding site comprises FRs homologous with a portion of the FRs of a human 
immunoglobulin and CDRs homologous with CDRs from a mouse or rat immunoglobulin. This type of 
chimeric polypeptide displays the antigen binding specificity of the mouse or rat immunoglobulin, while its 
human framework minimizes human immune reactions. In addition, the chimeric polypeptide may comprise
other amino acid sequences. It may comprise, for example, a sequence homologous to a portion of the
constant domain of an immunoglobulin, but preferably is free of constant regions (other than FRs). 
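
The combination of FRs from one immunoglobulin with CDRs from another can be sketched at the sequence level as follows. This is an illustrative outline only: the toy sequences, the boundary indices, and the function names are hypothetical, and in practice the CDR boundaries (which, as defined herein, may include residues flanking the hypervariable loops) would be taken from the sequence data for the antibodies concerned.

```python
def split_domain(seq, cdr_bounds):
    """Split a variable-domain sequence into 4 FR and 3 CDR segments.

    cdr_bounds: three (start, end) half-open index pairs, in order.
    """
    frs, cdrs, pos = [], [], 0
    for start, end in cdr_bounds:
        frs.append(seq[pos:start])   # framework segment before this CDR
        cdrs.append(seq[start:end])  # the CDR itself
        pos = end
    frs.append(seq[pos:])            # final framework segment (FR4)
    return frs, cdrs

def graft(fr_donor, fr_bounds, cdr_donor, cdr_bounds):
    """Interleave the FRs of one domain with the CDRs of another."""
    frs, _ = split_domain(fr_donor, fr_bounds)
    _, cdrs = split_domain(cdr_donor, cdr_bounds)
    out = frs[0]
    for cdr, fr in zip(cdrs, frs[1:]):
        out += cdr + fr
    return out

# Toy sequences: upper case marks FRs, lower case marks CDRs.
human_like = "AAAAxxxBBBByyCCCCzzzzDDDD"   # hypothetical FR donor
murine_like = "EEEEpppFFFFqqqGGGGrrHHHH"   # hypothetical CDR donor

chimeric = graft(
    human_like, [(4, 7), (11, 13), (17, 21)],
    murine_like, [(4, 7), (11, 14), (18, 20)],
)
print(chimeric)
```

The result keeps the four framework segments of the first sequence and carries the three CDR segments of the second, which is the arrangement described for a human-framework binding site with murine specificity.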

The binding site region(s) of the chimeric proteins are thus single chain composite polypeptides 
comprising a structure which in solution behaves like an antibody binding site. The two domain, single chain 
composite polypeptide has a structure patterned after tandem V H and V L domains, but with the carboxyl
terminal of one attached through a linking amino acid sequence to the amino terminal of the other. The
linking amino acid sequence may or may not itself be antigenic or biologically active. It preferably spans a
distance of at least about 40 Å, i.e., comprises at least about 14 amino acids, and comprises residues which
together present a hydrophilic, relatively unstructured region. Linking amino acid sequences having little or
no secondary structure work well. Optionally, one or a pair of unique amino acids or amino acid sequences
recognizable by a site specific cleavage agent may be included in the linker. This permits the V H and V L -
like domains to be separated after expression, or the linker to be excised after refolding of the binding site.
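
The stated linker preferences (at least about 14 residues for a span of about 40 Å, and no cysteine) lend themselves to a simple check. The sketch below is illustrative only; the per-residue span is obtained by dividing the two figures given in the text, which is an assumption made here for a rough estimate rather than a measured value, and the function name is hypothetical.

```python
MIN_LINKER_RESIDUES = 14            # "at least about 14 amino acids"
MIN_SPAN_ANGSTROMS = 40.0           # "at least about 40 angstroms"
# Assumption: average span per residue implied by the two figures above.
ANGSTROMS_PER_RESIDUE = MIN_SPAN_ANGSTROMS / MIN_LINKER_RESIDUES

def check_linker(seq: str) -> dict:
    """Check a candidate linker (single-letter code) against the preferences."""
    seq = seq.upper()
    return {
        "length": len(seq),
        "long_enough": len(seq) >= MIN_LINKER_RESIDUES,
        "cysteine_free": "C" not in seq,
        "approx_span_angstroms": len(seq) * ANGSTROMS_PER_RESIDUE,
    }

linker_report = check_linker("GGGGS" * 3)  # (Gly4-Ser)3
print(linker_report)
```

Under these assumptions the 15-residue (Gly4-Ser)3 linker satisfies both the length and the cysteine-free preference.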

Either the amino or carboxyl terminal ends (or both ends) of these chimeric, single chain binding sites 
are attached to an amino acid sequence which itself is bioactive or has some other function to produce a 
bifunctional or multifunctional protein. For example, the synthetic binding site may include a leader and/or
trailer sequence defining a polypeptide having enzymatic activity, independent affinity for an antigen
different from the antigen to which the binding site is directed, or having other functions such as to provide 
a convenient site of attachment for a radioactive ion, or to provide a residue designed to link chemically to a 
solid support. This fused, independently functional section of protein should be distinguished from fused 
leaders used simply to enhance expression in prokaryotic host cells or yeasts. The multifunctional proteins
also should be distinguished from the "conjugates" disclosed in the prior art comprising antibodies which,
after expression, are linked chemically to a second moiety. 

Often, a series of amino acids designed as a "spacer" is interposed between the active regions of the 
multifunctional protein. Use of such a spacer can promote independent refolding of the regions of the 
protein. The spacer also may include a specific sequence of amino acids recognized by an endopeptidase,
for example, endogenous to a target cell (e.g., one having a surface protein recognized by the binding site)
so that the bioactive effector protein is cleaved and released at the target. The second functional protein 
preferably is present as a trailer sequence, as trailers exhibit less of a tendency to interfere with the binding 
behavior of the BABS.

The therapeutic use of such "self-targeted" bioactive proteins offers a number of advantages over
conjugates of immunoglobulin fragments or complete antibody molecules: they are stable, less immunogenic
and have a lower molecular weight; they can penetrate body tissues more rapidly for purposes
of imaging or drug delivery because of their smaller size; and they can facilitate accelerated clearance of
targeted isotopes or drugs. Furthermore, because design of such structures at the DNA level as disclosed
herein permits ready selection of bioproperties and specificities, an essentially limitless combination of
binding sites and bioactive proteins is possible, each of which can be refined as disclosed herein to
optimize independent activity at each region of the synthetic protein. The synthetic proteins can be
expressed in procaryotes such as E. coli, and thus are less costly to produce than immunoglobulins or
fragments thereof which require expression in cultured animal cell lines.

The invention thus provides a family of recombinant proteins expressed from a single piece of DNA, all
of which have the capacity to bind specifically with a predetermined antigenic determinant. The preferred
species of the proteins comprise a second domain which functions independently of the binding region. In
this aspect the invention provides an array of "self-targeted" proteins which have a bioactive function and
which deliver that function to a locus determined by the binding site's specificity. It also provides
biosynthetic binding proteins having attached polypeptides suitable for attachment to immobilization 
matrices which may be used in affinity chromatography and solid phase immunoassay applications, or 
suitable for attachment to ions, e.g., radioactive ions, which may be used for in vivo imaging. 

The successful design and manufacture of the proteins of the invention depends on the ability to
produce biosynthetic binding sites, and most preferably, sites comprising two domains mimicking the
variable domains of immunoglobulin connected by a linker. 

As is now well known, Fv, the minimum antibody fragment which contains a complete antigen 
recognition and binding site, consists of a dimer of one heavy and one light chain variable domain in 
noncovalent association (Figure 1A). It is in this configuration that the three complementarity determining
regions of each variable domain interact to define an antigen binding site on the surface of the V H -V L dimer.
Collectively, the six complementarity determining regions (see Figure 1B) confer antigen binding specificity
to the antibody. FRs flanking the CDRs have a tertiary structure which is essentially conserved in native
immunoglobulins of species as diverse as human and mouse. These FRs serve to hold the CDRs in their
appropriate orientation. The constant domains are not required for binding function, but may aid in
stabilizing V H -V L interaction. Even a single variable domain (or half of an Fv comprising only three CDRs
specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than an
entire binding site (Painter et al. (1972) Biochem. 11:1327-1337).

This knowledge of the structure of immunoglobulin proteins has now been exploited to develop 
multifunctional fusion proteins comprising biosynthetic antibody binding sites and one or more other
domains.

The structure of these biosynthetic proteins in the region which imparts the binding properties to the
protein is analogous to the Fv region of a natural antibody. It comprises at least one, and preferably two
domains consisting of amino acids defining V H and V L -like polypeptide segments connected by a linker
which together form the tertiary molecular structure responsible for affinity and specificity. Each domain
comprises a set of amino acid sequences analogous to immunoglobulin CDRs held in appropriate
conformation by a set of sequences analogous to the framework regions (FRs) of an Fv fragment of a
natural antibody.

The term CDR, as used herein, refers to amino acid sequences which together define the binding
affinity and specificity of the natural Fv region of a native immunoglobulin binding site, or a synthetic
polypeptide which mimics this function. CDRs typically are not wholly homologous to hypervariable regions
of natural Fvs, but rather also may include specific amino acids or amino acid sequences which flank the
hypervariable region and have heretofore been considered framework not directly determinative of
complementarity. The term FR, as used herein, refers to amino acid sequences flanking or interposed between
CDRs.

The CDR and FR polypeptide segments are designed based on sequence analysis of the Fv region of
preexisting antibodies or of the DNA encoding them. In one embodiment, the amino acid sequences
constituting the FR regions of the BABS are analogous to the FR sequences of a first preexisting antibody,
for example, a human IgG. The amino acid sequences constituting the CDR regions are analogous to the
sequences from a second, different preexisting antibody, for example, the CDRs of a murine IgG.
Alternatively, the CDRs and FRs from a single preexisting antibody from, e.g., an unstable or hard to culture
hybridoma, may be copied in their entirety.

Practice of the invention enables the design and biosynthesis of various reagents, all of which are 
characterized by a region having affinity for a preselected antigenic determinant. The binding site and other
regions of the biosynthetic protein are designed with the particular planned utility of the protein in mind.
Thus, if the reagent is designed for intravascular use in mammals, the FR regions may comprise amino
acids similar or identical to at least a portion of the framework region amino acids of antibodies native to
that mammalian species. On the other hand, the amino acids comprising the CDRs may be analogous to a
portion of the amino acids from the hypervariable region (and certain flanking amino acids) of an antibody
having a known affinity and specificity, e.g., a murine or rat monoclonal antibody.

Other sections of native immunoglobulin protein structure, e.g., C H and C L , need not be present and 
normally are intentionally omitted from the biosynthetic proteins. However, the proteins of the invention 
normally comprise additional polypeptide or protein regions defining a bioactive region, e.g., a toxin or
enzyme, or a site onto which a toxin or a remotely detectable substance can be attached.

The invention thus can provide intact biosynthetic antibody binding sites analogous to V H -V L dimers,
either non-covalently associated, disulfide bonded, or preferably linked by a polypeptide sequence to form
a composite V H -V L or V L -V H polypeptide which may be essentially free of antibody constant region. The
invention also provides proteins analogous to an independent V H or V L domain, or dimers thereof. Any of
these proteins may be provided in a form linked to, for example, amino acids analogous or homologous to a
bioactive molecule such as a hormone or toxin.

Connecting the independently functional regions of the protein is a spacer comprising a short amino 
acid sequence whose function is to separate the functional regions so that they can independently assume 
their active tertiary conformation. The spacer can consist of an amino acid sequence present on the end of
a functional protein which sequence is not itself required for its function, and/or specific sequences
engineered into the protein at the DNA level. 

The spacer generally may comprise between 5 and 25 residues. Its optimal length may be determined
using constructs of different spacer lengths varying, for example, by units of 5 amino acids. The specific
amino acids in the spacer can vary. Cysteines should be avoided. Hydrophilic amino acids are preferred.
The spacer sequence may mimic the sequence of a hinge region of an immunoglobulin. It may also be
designed to assume a structure, such as a helical structure. Proteolytic cleavage sites may be designed
into the spacer separating the variable region-like sequences from other pendant sequences so as to
facilitate cleavage of intact BABS, free of other protein, or so as to release the bioactive protein in vivo.

Figures 2A-2E illustrate five examples of protein structures embodying the invention that can be
produced by following the teaching disclosed herein. All are characterized by a biosynthetic polypeptide
defining a binding site 3, comprising amino acid sequences comprising CDRs and FRs, often derived from
different immunoglobulins, or sequences homologous to a portion of CDRs and FRs from different
immunoglobulins. Figure 2A depicts a single chain construct comprising a polypeptide domain 10 having an
amino acid sequence analogous to the variable region of an immunoglobulin heavy chain, bound through its
carboxyl end to a polypeptide linker 12, which in turn is bound to a polypeptide domain 14 having an amino
acid sequence analogous to the variable region of an immunoglobulin light chain. Of course, the light and
heavy chain domains may be in reverse order. Alternatively, the binding site may comprise two substantially
homologous amino acid sequences which are both analogous to the variable region of an
immunoglobulin heavy or light chain.

The linker 12 should be long enough (e.g., about 15 amino acids or about 40 Å) to permit the chains 10
and 14 to assume their proper conformation. The linker 12 may comprise an amino acid sequence
homologous to a sequence identified as "self" by the species into which it will be introduced, if drug use is
intended. For example, the linker may comprise an amino acid sequence patterned after a hinge region of
an immunoglobulin. The linker preferably comprises hydrophilic amino acid sequences. It may also
comprise a bioactive polypeptide such as a cell toxin which is to be targeted by the binding site, or a
segment easily labelled by a radioactive reagent which is to be delivered, e.g., to the site of a tumor
comprising an epitope recognized by the binding site. The linker may also include one or two built-in
cleavage sites, i.e., an amino acid or amino acid sequence susceptible to attack by a site specific cleavage
agent as described below. This strategy permits the V H and V L -like domains to be separated after
expression, or the linker to be excised after folding while retaining the binding site structure in non-covalent
association. The amino acids of the linker preferably are selected from among those having relatively small,
unreactive side chains. Alanine, serine, and glycine are preferred.

Generally, the design of the linker involves considerations similar to the design of the spacer, excepting
that binding properties of the linked domains are seriously degraded if the linker sequence is shorter than
about 20 Å in length, i.e., comprises fewer than about 10 residues. Linkers longer than the approximate 40 Å
distance between the N-terminus of a native variable region and the C-terminus of its sister chain may be
used, but also potentially can diminish the BABS binding properties. Linkers comprising between 12 and 18
residues are preferred. The preferred length in specific constructs may be determined by varying linker
length first by units of 5 residues, and second by units of 1-4 residues after determining the best multiple of
the pentameric starting units.
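
The two-stage length search just described (coarse steps of 5 residues, then refinement by 1-4 residues around the best coarse value) might be sketched as follows. This is an illustrative sketch only and not part of the disclosure: the function names are hypothetical, `measure_binding` stands in for whatever experimental binding assay is used, and the default bounds of 5 to 25 residues are an assumption borrowed from the spacer range given above.

```python
from typing import Callable, List


def coarse_lengths(min_len: int = 5, max_len: int = 25, step: int = 5) -> List[int]:
    """Candidate lengths varied in units of `step` residues (assumed bounds)."""
    return list(range(min_len, max_len + 1, step))


def fine_lengths(best_coarse: int, step: int = 5) -> List[int]:
    """Lengths 1 to (step - 1) residues on either side of the best coarse length."""
    offsets = range(-(step - 1), step)  # -4 .. +4 when step is 5
    return [best_coarse + d for d in offsets if d != 0 and best_coarse + d > 0]


def select_linker_length(measure_binding: Callable[[int], float]) -> int:
    """Pick the best coarse length, then refine around it.

    `measure_binding` is a hypothetical callback returning a score
    (higher is better) for a construct with the given linker length.
    """
    best = max(coarse_lengths(), key=measure_binding)
    candidates = [best] + fine_lengths(best)
    return max(candidates, key=measure_binding)
```

In practice each call to `measure_binding` corresponds to building and testing a construct, so the coarse pass keeps the number of constructs small before the finer pass is attempted.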

Additional proteins or polypeptides may be attached to either or both the amino or carboxyl termini of
the binding site to produce multifunctional proteins of the type illustrated in Figures 2B-2E. As an example,
in Figure 2B, a helically coiled polypeptide structure 16 comprises a protein A fragment (FB) linked to the
amino terminal end of a V H -like domain 10 via a spacer 18. Figure 2C illustrates a bifunctional protein
having an effector polypeptide 20 linked via spacer 22 to the carboxyl terminus of polypeptide 14 of binding
protein segment 2. This effector polypeptide 20 may consist of, for example, a toxin, therapeutic drug,
binding protein, enzyme or enzyme fragment, site of attachment for an imaging agent (e.g., to chelate a
radioactive ion such as indium), or site of selective attachment to an immobilization matrix so that the BABS
can be used in affinity chromatography or solid phase binding assay. This effector alternatively may be
linked to the amino terminus of polypeptide 10, although trailers are preferred. Figure 2D depicts a
trifunctional protein comprising a linked pair of BABS 2 having another distinct protein domain 20 attached
to the N-terminus of the first binding protein segment. Use of multiple BABS in a single protein enables
production of constructs having very high selective affinity for multiepitopic sites such as cell surface
proteins.

The independently functional domains are attached by a spacer 18 (Figs. 2B and 2D) covalently linking
the C-terminus of the protein 16 or 20 to the N-terminus of the first domain 10 of the binding protein
segment 2, or by a spacer 22 linking the C-terminus of the second binding domain 14 to the N-terminus of
another protein (Figs. 2C and 2D). The spacer may be an amino acid sequence analogous to linker
sequence 12, or it may take other forms. As noted above, the spacer's primary function is to separate the
active protein regions to promote their independent bioactivity and permit each region to assume its
bioactive conformation independent of interference from its neighboring structure.

Figure 2E depicts another type of reagent, comprising a BABS having only one set of three CDRs, e.g.,
analogous to a heavy chain variable region, which retains a measure of affinity for the antigen. Attached to
the carboxyl end of the polypeptide 10 or 14 comprising the FR and CDR sequences constituting the 
binding site 3 through spacer 22 is effector polypeptide 20 as described above. 

As is evident from the foregoing, the invention provides a large family of reagents comprising
proteins, at least a portion of which defines a binding site patterned after the variable region of an
immunoglobulin. It will be apparent that the nature of any protein fragments linked to the BABS, and used
for reagents embodying the invention, is essentially unlimited, the essence of the invention being the
provision, either alone or linked to other proteins, of binding sites having specificities to any antigen desired.

The clinical administration of multifunctional proteins comprising a BABS, or a BABS alone, affords a
number of advantages over the use of intact natural or chimeric antibody molecules, fragments thereof, and
conjugates comprising such antibodies linked chemically to a second bioactive moiety. The multifunctional
proteins described herein offer fewer cleavage sites to circulating proteolytic enzymes, their functional
domains are connected by peptide bonds to polypeptide linker or spacer sequences, and thus the proteins
have improved stability. Because of their smaller size and efficient design, the multifunctional proteins
described herein reach their target tissue more rapidly, and are cleared more quickly from the body. They
also have reduced immunogenicity. In addition, their design facilitates coupling to other moieties in drug
targeting and imaging applications. Such coupling may be conducted chemically after expression of the
BABS to a site of attachment for the coupling product engineered into the protein at the DNA level. Active
effector proteins having toxic, enzymatic, binding, modulating, cell differentiating, hormonal, or other
bioactivity are expressed from a single DNA as a leader and/or trailer sequence, peptide bonded to the
BABS.

Design and Manufacture 

The proteins of the invention are designed at the DNA level. The chimeric or synthetic DNAs are then
expressed in a suitable host system, and the expressed proteins are collected and renatured if necessary. A
preferred general structure of the DNA encoding the proteins is set forth in Figure 8. As illustrated, it
encodes an optimal leader sequence used to promote expression in procaryotes having a built-in cleavage
site recognizable by a site specific cleavage agent, for example, an endopeptidase, used to remove the
leader after expression. This is followed by DNA encoding a V H -like domain, comprising CDRs and FRs, a
linker, a V L -like domain, again comprising CDRs and FRs, a spacer, and an effector protein. After
expression, folding, and cleavage of the leader, a bifunctional protein is produced having a binding region
whose specificity is determined by the CDRs, and a peptide-linked independently functional effector region.

The ability to design the BABS of the invention depends on the ability to determine the sequence of the
amino acids in the variable region of monoclonal antibodies of interest, or the DNA encoding them.
Hybridoma technology enables production of cell lines secreting antibody to essentially any desired
substance that produces an immune response. RNA encoding the light and heavy chains of the
immunoglobulin can then be obtained from the cytoplasm of the hybridoma. The 5' end portion of the mRNA
can be used to prepare cDNA for subsequent sequencing, or the amino acid sequence of the hypervariable
and flanking framework regions can be determined by amino acid sequencing of the V region fragments of
the H and L chains. Such sequence analysis is now conducted routinely. This knowledge, coupled with
observations and deductions of the generalized structure of immunoglobulin Fvs, permits one to design
synthetic genes encoding FR and CDR sequences which likely will bind the antigen. These synthetic genes
are then prepared using known techniques, or using the technique disclosed below, inserted into a suitable
host, and expressed, and the expressed protein is purified. Depending on the host cell, renaturation
techniques may be required to attain proper conformation. The various proteins are then tested for binding
ability, and one having appropriate affinity is selected for incorporation into a reagent of the type described
above. If necessary, point substitutions seeking to optimize binding may be made in the DNA using
conventional cassette mutagenesis or other protein engineering methodology such as is disclosed below.

Preparation of the proteins of the invention also is dependent on knowledge of the amino acid sequence 
(or corresponding DNA or RNA sequence) of bioactive proteins such as enzymes, toxins, growth factors, 
cell differentiation factors, receptors, anti-metabolites, hormones or various cytokines or lymphokines. Such
sequences are reported in the literature and available through computerized data banks.

The DNA sequences of the binding site and the second protein domain are fused using conventional 
techniques, or assembled from synthesized oligonucleotides, and then expressed using equally conventional
techniques.

The processes for manipulating, amplifying, and recombining DNA which encode amino acid sequences
of interest are generally well known in the art, and therefore, not described in detail herein. Methods of
identifying and isolating genes encoding antibodies of interest are well understood, and described in the
patent and other literature. In general, the methods involve selecting genetic material coding for amino acids
which define the proteins of interest, including the CDRs and FRs of interest, according to the genetic code.
Accordingly, the construction of DNAs encoding proteins as disclosed herein can be done using known
techniques involving the use of various restriction enzymes which make sequence specific cuts in DNA to
produce blunt ends or cohesive ends, DNA ligases, techniques enabling enzymatic addition of sticky ends
to blunt-ended DNA, construction of synthetic DNAs by assembly of short or medium length
oligonucleotides, cDNA synthesis techniques, and synthetic probes for isolating immunoglobulin or other
bioactive protein genes. Various promoter sequences and other regulatory DNA sequences used in
achieving expression, and various types of host cells are also known and available. Conventional
transfection techniques, and equally conventional techniques for cloning and subcloning DNA are useful in the
practice of this invention and known to those skilled in the art. Various types of vectors may be used such
as plasmids and viruses including animal viruses and bacteriophages. The vectors may exploit various
marker genes which impart to a successfully transfected cell a detectable phenotypic property that can be
used to identify which of a family of clones has successfully incorporated the recombinant DNA of the
vector.

One method for obtaining DNA encoding the proteins disclosed herein is by assembly of synthetic
oligonucleotides produced in a conventional, automated, polynucleotide synthesizer followed by ligation with
appropriate ligases. For example, overlapping, complementary DNA fragments comprising 15 bases may be
synthesized semi-manually using phosphoramidite chemistry, with end segments left unphosphorylated to
prevent polymerization during ligation. One end of the synthetic DNA is left with a "sticky end"
corresponding to the site of action of a particular restriction endonuclease, and the other end is left with an end
corresponding to the site of action of another restriction endonuclease. Alternatively, this approach can be
fully automated. The DNA encoding the protein may be created by synthesizing longer single strand
fragments (e.g., 50-100 nucleotides long) in, for example, a Biosearch oligonucleotide synthesizer, and then
ligating the fragments.

A method of producing the BABS of the invention is to produce a synthetic DNA encoding a
polypeptide comprising, e.g., human FRs and intervening "dummy" CDRs, or amino acids having no
function except to define suitably situated unique restriction sites. This synthetic DNA is then altered by
DNA replacement, in which restriction and ligation are employed to insert synthetic oligonucleotides encoding
CDRs defining a desired binding specificity in the proper location between the FRs. This approach
facilitates empirical refinement of the binding properties of the BABS.

This technique is dependent upon the ability to cleave a DNA corresponding in structure to a variable
domain gene at specific sites flanking nucleotide sequences encoding CDRs. These restriction sites in
some cases may be found in the native gene. Alternatively, non-native restriction sites may be engineered
into the nucleotide sequence resulting in a synthetic gene with a different sequence of nucleotides than the
native gene, but encoding the same variable region amino acids because of the degeneracy of the genetic
code. The fragments resulting from endonuclease digestion, and comprising FR-encoding sequences, are
then ligated to non-native CDR-encoding sequences to produce a synthetic variable domain gene with
altered antigen binding specificity. Additional nucleotide sequences encoding, for example, constant region
amino acids or a bioactive molecule may then be linked to the gene sequences to produce a bifunctional
protein.

The expression of these synthetic DNAs can be achieved in both prokaryotic and eucaryotic systems
via transfection with an appropriate vector. In E. coli and other microbial hosts, the synthetic genes can be
expressed as a fusion protein which is subsequently cleaved. Expression in eucaryotes can be accomplished
by the transfection of DNA sequences encoding CDR and FR region amino acids and the amino acids
defining a second function into a myeloma or other type of cell line. By this strategy intact hybrid antibody
molecules having hybrid Fv regions and various bioactive proteins including a biosynthetic binding site may
be produced. For fusion protein expressed in bacteria, subsequent proteolytic cleavage of the isolated
fusions can be performed to yield free BABS, which can be renatured to obtain an intact biosynthetic,
hybrid antibody binding site.

Heretofore, it has not been possible to cleave the heavy and light chain region to separate the variable
and constant regions of an immunoglobulin so as to produce intact Fv, except in specific cases not of
commercial utility. However, one method of producing BABS in accordance with this invention is to
redesign DNAs encoding the heavy and light chains of an immunoglobulin, optionally altering its specificity
or humanizing its FRs, and incorporating a cleavage site and "hinge region" between the variable and
constant regions of both the heavy and light chains. Such chimeric antibodies can be produced in
transfectomas or the like and subsequently cleaved using a preselected endopeptidase.

The hinge region is a sequence of amino acids which serves to promote efficient cleavage by a
preselected cleavage agent at a preselected, built-in cleavage site. It is designed to promote cleavage
preferentially at the cleavage site when the polypeptide is treated with the cleavage agent in an appropriate
environment.

The hinge region can take many different forms. Its design involves selection of amino acid residues
(and a DNA fragment encoding them) which impart to the region of the fused protein about the cleavage
site an appropriate polarity, charge distribution, and stereochemistry which, in the aqueous environment
where the cleavage takes place, efficiently expose the cleavage site to the cleavage agent in preference to
other potential cleavage sites that may be present in the polypeptide, and/or improve the kinetics of the
cleavage reaction. In specific cases, the amino acids of the hinge are selected and assembled in sequence
based on their known properties, and then the fused polypeptide sequence is expressed, tested, and
altered for refinement.

The hinge region is free of cysteine. This enables the cleavage reaction to be conducted under
conditions in which the protein assumes its tertiary conformation, and may be held in this conformation by
intramolecular disulfide bonds. It has been discovered that in these conditions access of the protease to
potential cleavage sites which may be present within the target protein is hindered. The hinge region may
comprise an amino acid sequence which includes one or more proline residues. This allows formation of a
substantially unfolded molecular segment. Aspartic acid, glutamic acid, arginine, lysine, serine, and
threonine residues maximize ionic interactions and may be present in amounts and/or in sequence which
render the moiety comprising the hinge water soluble.

The cleavage site preferably is immediately adjacent the Fv polypeptide chains and comprises one
amino acid or a sequence of amino acids exclusive of any sequence found in the amino acid structure of
the chains in the Fv. The cleavage site preferably is designed for unique or preferential cleavage by a
specific selected agent. Endopeptidases are preferred, although non-enzymatic (chemical) cleavage agents
may be used. Many useful cleavage agents, for instance, cyanogen bromide, dilute acid, trypsin,
Staphylococcus aureus V-8 protease, post proline cleaving enzyme, blood coagulation Factor Xa, enterokinase,
and renin, recognize and preferentially or exclusively cleave particular cleavage sites. One currently
preferred cleavage agent is V-8 protease. The currently preferred cleavage site is a Glu residue. Other
useful enzymes recognize multiple residues as a cleavage site, e.g., factor Xa (Ile-Glu-Gly-Arg) or
enterokinase (Asp-Asp-Asp-Asp-Lys). The principles of this selective cleavage approach may also be used
in the design of the linker and spacer sequences of the multifunctional constructs of the invention where an
excisable linker or selectively cleavable linker or spacer is desired.
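
The requirement that a cleavage site be exclusive of any sequence found within the binding domain can be checked with a simple motif search. The sketch below is illustrative only: the recognition motifs are those named in the text (in single-letter code), while the function names and the example sequence in the usage note are hypothetical. It considers primary sequence only and ignores how accessible a site is in the folded protein.

```python
from typing import Dict, List

# Recognition sequences named in the text, in single-letter code.
CLEAVAGE_SITES: Dict[str, str] = {
    "V-8 protease": "E",      # cleaves at a Glu residue
    "factor Xa": "IEGR",      # Ile-Glu-Gly-Arg
    "enterokinase": "DDDDK",  # Asp-Asp-Asp-Asp-Lys
}


def find_sites(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def usable_agents(binding_domain: str) -> List[str]:
    """Agents whose recognition sequence does not occur inside the binding domain."""
    return [name for name, motif in CLEAVAGE_SITES.items()
            if not find_sites(binding_domain, motif)]
```

For a hypothetical fragment such as `"QVQLEGRSAK"`, `usable_agents` excludes V-8 protease (the fragment contains Glu) and retains factor Xa and enterokinase, whose full recognition sequences are absent.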

Design of Synthetic V H and V L Mimics

FRs from the heavy and light chain murine anti-digoxin monoclonal 26-10 (Figures 4A and 4B) were
encoded on the same DNAs with CDRs from the murine anti-lysozyme monoclonal glp-4 heavy chain
(Figure 3 sequence 1) and light chain to produce V H (Figure 4C) and V L (Figure 4D) regions together
defining a biosynthetic antibody binding site which is specific for lysozyme. Murine CDRs from both the
heavy and light chains of monoclonal glp-4 were encoded on the same DNAs with FRs from the heavy and
light chains of human myeloma antibody NEWM (Figures 4E and 4F). The resulting interspecies chimeric
antibody binding domain has reduced immunogenicity in humans because of its human FRs, and specificity
for lysozyme because of its murine CDRs.

A synthetic DNA was designed to facilitate CDR insertions into a human heavy chain FR and to 
facilitate empirical refinement of the resulting chimeric amino acid sequence. This DNA is depicted in 
Figure 5. 

A synthetic, bifunctional FB-binding site protein was also designed at the DNA level, expressed,
purified, renatured, and shown to bind specifically with a preselected antigen (digoxin) and Fc. The detailed
primary structure of this construct is shown in Figure 6; its tertiary structure is illustrated schematically in
Figure 2B.

Details of these and other experiments, and additional design principles on which the invention is 
based, are set forth below. 

GENE DESIGN AND EXPRESSION 

Given known variable region DNA sequences, synthetic V L and V H genes may be designed which
encode native or near native FR and CDR amino acid sequences from an antibody molecule, each
separated by unique restriction sites located as close to FR-CDR and CDR-FR borders as possible.
Alternatively, genes may be designed which encode native FR sequences which are similar or identical to
the FRs of an antibody molecule from a selected species, each separated by "dummy" CDR sequences
containing strategically located restriction sites. These DNAs serve as starting materials for producing
BABS, as the native or "dummy" CDR sequences may be excised and replaced with sequences encoding
the CDR amino acids defining a selected binding site. Alternatively, one may design and directly synthesize
native or near-native FR sequences from a first antibody molecule, and CDR sequences from a second
antibody molecule. Any one of the V H and V L sequences described above may be linked together directly,
via an amino acid chain or linker connecting the C-terminus of one chain with the N-terminus of the other.
These genes, once synthesized, may be cloned with or without additional DNA sequences coding for,
e.g., an antibody constant region, enzyme, or toxin, or a leader peptide which facilitates secretion or
intracellular stability of a fusion polypeptide. The genes then can be expressed directly in an appropriate
host cell, or can be further engineered before expression by the exchange of FR, CDR, or "dummy" CDR
sequences with new sequences. This manipulation is facilitated by the presence of the restriction sites
which have been engineered into the gene at the FR-CDR and CDR-FR borders.

Figure 3 illustrates the general approach to designing a chimeric V H; further details of exemplary
designs at the DNA level are shown in Figures 4A-4F. Figure 3, lines 1 and 2, show the amino acid
sequences of the heavy chain variable region of the murine monoclonals glp-4 (anti-lysozyme) and 26-10
(anti-digoxin), including the four FR and three CDR sequences of each. Line 3 shows the sequence of a
chimeric V H which comprises 26-10 FRs and glp-4 CDRs. As illustrated, the hybrid protein of line 3 is
identical to the native protein of line 2, except that 1) the sequence TFTNYYIHWLK has replaced the
sequence IFTDFYMNVWR, 2) EWIGWIYPGNGNTKYNENFKG has replaced DYIGYISPYSGVTGYNQKFKG,
3) RYTHYYF has replaced GSSGNKWAM, and 4) A has replaced V as the sixth amino acid beyond CDR-2.
These changes have the effect of changing the specificity of the 26-10 V H to mimic the specificity of glp-4.
The Val to Ala single amino acid replacement within the relatively conserved framework region of 26-10 is
an example of the replacement of an amino acid outside the hypervariable region made for the purpose of
altering specificity by CDR replacement. Beneath sequence 3 of Figure 3, the restriction sites in the DNA
encoding the chimeric V H (see Figures 4A-4F) are shown which are disposed about the CDR-FR borders.

Lines 4 and 5 of Figure 3 represent another construct. Line 4 is the full length V H of the human antibody
NEWM. That human antibody may be made specific for lysozyme by CDR replacement as shown in line 5.
Thus, for example, the segment TFTNYYIHWLK from glp-4 replaces TFSNDYYTWVR of NEWM, and its
other CDRs are replaced as shown. This results in a V H comprising a human framework with murine
sequences determining specificity.

By sequencing any antibody, or obtaining the sequence from the literature, in view of this disclosure
one skilled in the art can produce a BABS of any desired specificity comprising any desired framework
region. Diagrams such as Figure 3 comparing the amino acid sequence are valuable in suggesting which
particular amino acids should be replaced to determine the desired complementarity. Expressed sequences
may be tested for binding and refined by exchanging selected amino acids in relatively conserved regions,
based on observation of trends in amino acid sequence data and/or computer modeling techniques.

Significant flexibility in V H and V L design is possible because the amino acid sequences are determined
at the DNA level, and the manipulation of DNA can be accomplished easily.

For example, the DNA sequence for murine V H and V L 26-10 containing specific restriction sites flanking
each of the three CDRs was designed with the aid of a commercially available computer program which
performs combined reverse translation and restriction site searches ("RV.exe" by Compugene, Inc.). The
known amino acid sequences for V H and V L 26-10 polypeptides were entered, and all potential DNA
sequences which encode those peptides and all potential restriction sites were analyzed by the program.
The program can, in addition, select DNA sequences encoding the peptide using only codons preferred by
E. coli if this bacterium is to be the host expression organism of choice. Figures 4A and 4B show an example of
program output. The nucleic acid sequences of the synthetic genes and the corresponding amino acids are
shown. Sites of restriction endonuclease cleavage are also indicated. The CDRs of these synthetic genes
are underlined.
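The combined reverse translation and restriction site search described above can be illustrated with a minimal Python sketch. It is not the Compugene program: the codon table (one frequently used E. coli codon per amino acid), the short enzyme list, and the function names are assumptions chosen for illustration, and the sketch emits a single encoding rather than enumerating all degenerate DNA sequences.

```python
# Illustrative table: one frequently used E. coli codon per amino acid (an assumption,
# not the table used by the program named in the text).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Recognition sequences of a few six-base cutters named in the text.
SITES = {
    "EcoRI": "GAATTC", "BamHI": "GGATCC", "XbaI": "TCTAGA",
    "KpnI": "GGTACC", "PstI": "CTGCAG",
}


def reverse_translate(peptide):
    """Return one DNA sequence encoding `peptide` (single-letter amino acid code)."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())


def find_sites(dna, sites=SITES):
    """Map each enzyme name to the 0-based start positions of its site in `dna`."""
    hits = {}
    for name, site in sites.items():
        positions = [i for i in range(len(dna) - len(site) + 1)
                     if dna.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits


cdr_segment = "TFTNYYIHWLK"  # glp-4 segment quoted earlier in the text
dna = reverse_translate(cdr_segment)
print(dna)
print(find_sites(dna))
```

A fuller search would also vary synonymous codons to create or remove sites without changing the encoded amino acids, which is the design freedom the text relies on.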

The DNA sequences for the synthetic 26-10 V H and V L are designed so that one or both of the
restriction sites flanking each of the three CDRs are unique. A six base site (such as that recognized by
Bsm I or BspM I) is preferred, but where six base sites are not possible, four or five base sites are used.
These sites, if not already unique, are rendered unique within the gene by eliminating other occurrences
within the gene without altering necessary amino acid sequences. Preferred cleavage sites are those that,
once cleaved, yield fragments with sticky ends just outside of the boundary of the CDR within the
framework. However, such ideal sites are only occasionally possible because the FR-CDR boundary is not
an absolute one, and because the amino acid sequence of the FR may not permit a restriction site. In these
cases, flanking sites in the FR which are more distant from the predicted boundary are selected.
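The uniqueness requirement can be checked mechanically. The following sketch, under the assumption that a gene is represented as a plain string of its top strand, counts occurrences of a recognition sequence on both strands and reports which candidate sites occur exactly once. The toy gene and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_site(gene, site):
    """Count occurrences of a recognition site on either strand of `gene`."""
    def occurrences(pattern):
        return sum(gene.startswith(pattern, i)
                   for i in range(len(gene) - len(pattern) + 1))

    n = occurrences(site)
    if revcomp(site) != site:  # non-palindromic site: also scan the other strand
        n += occurrences(revcomp(site))
    return n


def unique_sites(gene, sites):
    """Names of the enzymes whose site occurs exactly once in `gene`."""
    return sorted(name for name, site in sites.items()
                  if count_site(gene, site) == 1)


# Hypothetical toy gene: the KpnI site occurs twice, the XbaI site once.
toy_gene = "GGTACC" + "AAAA" + "TCTAGA" + "CCCC" + "GGTACC"
print(unique_sites(toy_gene, {"KpnI": "GGTACC", "XbaI": "TCTAGA"}))
```

A site reported as non-unique would then be a candidate for removal of its extra occurrences by synonymous codon changes, as the paragraph above describes.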

Figure 5 discloses the nucleotide and corresponding amino acid sequence (shown in standard single
letter code) of a synthetic DNA comprising a master framework gene having the generic structure:

R1-FR1-X1-FR2-X2-FR3-X3-FR4-R2

where R1 and R2 are restricted ends which are to be ligated into a vector, and X1, X2, and X3 are DNA
sequences whose function is to provide convenient restriction sites for CDR insertion. This particular DNA
has murine FR sequences and unique, 6-base restriction sites adjacent to the FR borders so that nucleotide
sequences encoding CDRs from a desired monoclonal can be inserted easily. Restriction endonuclease
digestion sites are indicated with their abbreviations; enzymes of choice for CDR replacement are
underscored. Digestion of the gene with the following restriction endonucleases results in 3' and 5' ends
which can easily be matched up with and ligated to native or synthetic CDRs of desired specificity: KpnI
and BstXI are used for ligation of CDR1; XbaI and DraI for CDR2; and BssHII and ClaI for CDR3.

OLIGONUCLEOTIDE SYNTHESIS

The synthetic genes and DNA fragments designed as described above preferably are produced by
assembly of chemically synthesized oligonucleotides. 15-100mer oligonucleotides may be synthesized on a
Biosearch DNA Model 8600 Synthesizer, and purified by polyacrylamide gel electrophoresis (PAGE) in
Tris-Borate-EDTA buffer (TBE). The DNA is then electroeluted from the gel. Overlapping oligomers may be
phosphorylated by T4 polynucleotide kinase and ligated into larger blocks which may also be purified by
PAGE.

CLONING OF SYNTHETIC OLIGONUCLEOTIDES

The blocks or the pairs of longer oligonucleotides may be cloned into E. coli using a suitable, e.g., pUC,
cloning vector. Initially, this vector may be altered by single strand mutagenesis to eliminate residual six
base altered sites. For example, V H may be synthesized and cloned into pUC as five primary blocks
spanning the following restriction sites: 1. EcoRI to first NarI site; 2. first NarI to XbaI; 3. XbaI to SalI; 4. SalI
to NcoI; 5. NcoI to BamHI. These cloned fragments may then be isolated and assembled in several
three-fragment ligations and cloning steps into the pUC8 plasmid. Desired ligations selected by PAGE are then
transformed into, for example, E. coli strain JM83, and plated onto LB Ampicillin + Xgal plates according to
standard procedures. The gene sequence may be confirmed by supercoil sequencing after cloning, or after
subcloning into M13 via the dideoxy method of Sanger.

PRINCIPLE OF CDR EXCHANGE

Three CDRs (or alternatively, four FRs) can be replaced per V H or V L. In simple cases, this can be
accomplished by cutting the shuttle pUC plasmid containing the respective genes at the two unique
restriction sites flanking each CDR or FR, removing the excised sequence, and ligating the vector with a
native nucleic acid sequence or a synthetic oligonucleotide encoding the desired CDR or FR. This three
part procedure would have to be repeated three times for total CDR replacement and four times for total FR
replacement. Alternatively, a synthetic nucleotide encoding two consecutive CDRs separated by the
appropriate FR can be ligated to a pUC or other plasmid containing a gene whose corresponding CDRs and
FR have been cleaved out. This procedure reduces the number of steps required to perform CDR and/or
FR exchange.
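The cut, remove, and ligate cycle can be modelled at the sequence level. The sketch below is a deliberate simplification: it treats the gene as a string, keeps both flanking recognition sites intact, and exchanges only the sequence strictly between them, ignoring the actual cut positions and sticky ends of real enzymes. The toy gene, insert, and function names are hypothetical.

```python
def locate_unique(gene, site):
    """Return the start index of `site`, insisting that it occurs exactly once."""
    first = gene.find(site)
    if first == -1 or gene.find(site, first + 1) != -1:
        raise ValueError(f"site {site!r} must occur exactly once")
    return first


def swap_cassette(gene, left_site, right_site, new_insert):
    """Replace the segment between two unique flanking sites with `new_insert`."""
    left_end = locate_unique(gene, left_site) + len(left_site)
    right_start = locate_unique(gene, right_site)
    if right_start < left_end:
        raise ValueError("left site must precede right site")
    return gene[:left_end] + new_insert + gene[right_start:]


# Hypothetical toy gene laid out as: FR, KpnI site, old CDR, XbaI site, FR.
gene = "AA" + "GGTACC" + "CACACA" + "TCTAGA" + "GG"
print(swap_cassette(gene, "GGTACC", "TCTAGA", "GATGAT"))
```

Calling `swap_cassette` once per CDR mirrors the three repetitions mentioned above; passing an insert that spans two CDRs and the intervening FR, with the outermost pair of sites, mirrors the shortened alternative.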

EXPRESSION OF PROTEINS

The engineered genes can be expressed in appropriate prokaryotic hosts such as various strains of E.
coli, and in eucaryotic hosts such as Chinese hamster ovary cells, murine myeloma, and human
myeloma/transfectoma cells.

For example, if the gene is to be expressed in E. coli, it may first be cloned into an expression vector.
This is accomplished by positioning the engineered gene downstream from a promoter sequence such as
trp or tac, and a gene coding for a leader peptide. The resulting expressed fusion protein accumulates in
refractile bodies in the cytoplasm of the cells, and may be harvested after disruption of the cells by French
press or sonication. The refractile bodies are solubilized, and the expressed proteins refolded and cleaved
by the methods already established for many other recombinant proteins.

If the engineered gene is to be expressed in myeloma cells, the conventional expression system for
immunoglobulins, it is first inserted into an expression vector containing, for example, the Ig promoter, a
secretion signal, immunoglobulin enhancers, and various introns. This plasmid may also contain sequences
encoding all or part of a constant region, enabling an entire part of a heavy or light chain to be expressed.
The gene is transfected into myeloma cells via established electroporation or protoplast fusion methods.
Cells so transfected can express V L or V H fragments, or V H2 homodimers, V L -V H heterodimers, V H -V L or
V L -V H single chain polypeptides, complete heavy or light immunoglobulin chains, or portions thereof, each
of which may be attached in the various ways discussed above to a protein region having another function
(e.g., cytotoxicity).

Vectors containing a heavy chain V region (or V and C regions) can be cotransfected with analogous
vectors carrying a light chain V region (or V and C regions), allowing for the expression of noncovalently
associated binding sites (or complete antibody molecules).

In the examples which follow, a specific example of how to make a single chain binding site is
disclosed, together with methods employed to assess its binding properties. Thereafter, a protein construct
having two functional domains is disclosed. Lastly, there is disclosed a series of additional targeted proteins
which exemplify the invention.

I. EXAMPLE OF CDR EXCHANGE AND EXPRESSION

The synthetic genes coding for murine V H and V L 26-10 shown in Figures 4A and 4B were designed
from the known amino acid sequence of the protein with the aid of Compugene, a software program. These
genes, although coding for the native amino acid sequences, also contain non-native and often unique
restriction sites flanking nucleic acid sequences encoding CDRs to facilitate CDR replacement as noted
above.

Both the 3' and 5' ends of the large synthetic oligomers were designed to include 6-base restriction
sites, present in the genes and the pUC vector. Furthermore, those restriction sites in the synthetic genes
which were only suited for assembly but not for cloning into pUC were extended by "helper" cloning sites
with matching sites in pUC.

Cloning of the synthetic DNA and later assembly of the gene is facilitated by the spacing of unique
restriction sites along the gene. This allows corrections and modifications by cassette mutagenesis at any
location. Among them are alterations near the 5' or 3' ends of the gene as needed for the adaptation to
different expression vectors. For example, a PstI site is positioned near the 5' end of the V H gene. Synthetic
linkers can be attached easily between this site and a restriction site in the expression plasmid. These
genes were synthesized by assembling oligonucleotides as described above using a Biosearch Model 8600
DNA Synthesizer. They were ligated to vector pUC8 for transformation of E. coli.

Specific CDRs may be cleaved from the synthetic V H gene by digestion with the following pairs of
restriction endonucleases: HphI and BstXI for CDR1; XbaI and DraI for CDR2; and BanII and BanI for CDR3.
After removal of one CDR, another CDR of desired specificity may be ligated directly into the restricted
gene in its place if the 3' and 5' ends of the restricted gene and the new CDR contain complementary
single stranded DNA sequences.

In the present example, the three CDRs of each of murine V H 26-10 and V L 26-10 were replaced with
the corresponding CDRs of glp-4. The nucleic acid sequences and corresponding amino acid sequences of
the chimeric V H and V L genes encoding the FRs of 26-10 and CDRs of glp-4 are shown in Figures 4C and
4D. The positions of the restriction endonuclease cleavage sites are noted with their standard abbreviations.
CDR sequences are underlined, as are the restriction endonucleases of choice useful for further CDR
replacement.

These genes were cloned into pUC8, a shuttle plasmid. To retain unique restriction sites after cloning,
the V H -like gene was spliced into the EcoRI and HindIII or BamHI sites of the plasmid.

Direct expression of the genes may be achieved in E. coli. Alternatively, the gene may be preceded by
a leader sequence and expressed in E. coli as a fusion product by splicing the fusion gene into the host
gene whose expression is regulated by interaction of a repressor with the respective operator. The protein
can be induced by starvation in minimal medium and by chemical inducers. The V H -V L biosynthetic 26-10
gene has been expressed as such a fusion protein behind the trp and tac promoters. The gene translation
product of interest may then be cleaved from the leader in the fusion protein by, e.g., cyanogen bromide
degradation, tryptic digestion, mild acid cleavage, and/or digestion with factor Xa protease. Therefore, a
shuttle plasmid containing a synthetic gene encoding a leader peptide having a site for mild acid cleavage,
and into which has been spliced the synthetic BABS gene, was used for this purpose. In addition, synthetic
DNA sequences encoding a signal peptide for secretion of the processed target protein into the periplasm
of the host cell can also be incorporated into the plasmid.

After harvesting the gene product and optionally releasing it from a fusion peptide, its activity as an
antibody binding site and its specificity for the glp-4 (lysozyme) epitope are assayed by established
immunological techniques, e.g., affinity chromatography and radioimmunoassay. Correct folding of the protein
to yield the proper three-dimensional conformation of the antibody binding site is prerequisite for its activity.
This occurs spontaneously in a host such as a myeloma cell which naturally expresses immunoglobulin
proteins. Alternatively, for bacterial expression, the protein forms inclusion bodies which, after harvesting,
must be subjected to a specific sequence of solvent conditions (e.g., diluted 20 × from 8 M urea, 0.1 M
Tris-HCl, pH 9, into 0.15 M NaCl, 0.01 M sodium phosphate, pH 7.4 (Hochman et al. (1976) Biochem.
15:2706-2710)) to assume its correct conformation and hence its active form.

Figures 4E and 4F show the DNA and amino acid sequence of chimeric V H and V L comprising human
FRs from NEWM and murine CDRs from glp-4. The CDRs are underlined, as are restriction sites of choice 
for further CDR replacement or empirically determined refinement. 

These constructs also constitute master framework genes, this time constructed of human framework
sequences. They may be used to construct BABS of any desired specificity by appropriate CDR
replacement.

Binding sites with other specificities have also been designed using the methodologies disclosed 
herein. Examples include those having FRs from the human NEWM antibody and CDRs from murine 26-10 
(Figure 9A), murine 26-10 FRs and G-loop CDRs (Figure 9B), FRs and CDRs from murine MOPC-315 
(Figure 9C), FRs and CDRs from an anti-human carcinoembryonic antigen monoclonal antibody (Figure
9D), and FRs and CDRs 1, 2, and 3 from V L and FRs and CDR 1 and 3 from the V H of the anti-CEA 
antibody, with CDR 2 from a consensus immunoglobulin gene (Figure 9E). 

II. Model Binding Site:

The digoxin binding site of the IgG2a,κ monoclonal antibody 26-10 has been analyzed by
Mudgett-Hunter and colleagues (unpublished). The 26-10 V region sequences were determined from both amino
acid sequencing and DNA sequencing of 26-10 H and L chain mRNA transcripts (D. Panka, J.N. & M.N.M.,
unpublished data). The 26-10 antibody exhibits a high digoxin binding affinity [Ko = 5.4 × 10^9 M^-1] and has
a well-defined specificity profile, providing a baseline for comparison with the biosynthetic binding sites
mimicking its structure.

Protein Design:

Crystallographically determined atomic coordinates for Fab fragments of 26-10 were obtained from the
Brookhaven Data Bank. Inspection of the available three-dimensional structures of Fv regions within their
parent Fab fragments indicated that the Euclidean distance between the C-terminus of the V H domain and
the N-terminus of the V L domain is about 35 Å. Considering that the peptide unit length is approximately 3.8
Å, a 15 residue linker was selected to bridge this gap. The linker was designed so as to exhibit little
propensity for secondary structure and not to interfere with domain folding. Thus, the 15 residue sequence
(Gly-Gly-Gly-Gly-Ser)3 was selected to connect the V H carboxyl- and V L amino-termini.
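The linker length arithmetic can be made explicit with a short sketch. It uses only the two figures quoted above (a 35 Å gap and about 3.8 Å per peptide unit); treating the chain as fully extended is an assumption of the sketch, and the function names are hypothetical.

```python
import math

PEPTIDE_UNIT_LENGTH = 3.8  # Å per residue, as quoted in the text
GAP = 35.0                 # Å between the V H C-terminus and the V L N-terminus


def min_residues(gap, unit=PEPTIDE_UNIT_LENGTH):
    """Fewest residues whose fully extended length reaches `gap`."""
    return math.ceil(gap / unit)


def max_span(n_residues, unit=PEPTIDE_UNIT_LENGTH):
    """Length in Å of a fully extended chain of `n_residues`."""
    return n_residues * unit


linker = "GGGGS" * 3  # (Gly-Gly-Gly-Gly-Ser)3 in single-letter code
print(min_residues(GAP), len(linker), max_span(len(linker)))
```

On these numbers a fully extended chain needs at least 10 residues to bridge the gap, while 15 residues can span roughly 57 Å, so the chosen linker need not be fully stretched. That reading of the margin is an inference from the arithmetic, not a statement made in the text.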

Binding studies with single chain binding sites having linkers of fewer or more than 15 residues demonstrate
the importance of the prerequisite distance which must separate V H from V L; for example, a (Gly4-Ser)1
linker does not demonstrate binding activity, and those with (Gly4-Ser)5 linkers exhibit very low activity
compared to those with (Gly4-Ser)3 linkers.

Gene Synthesis:

Design of the 744 base sequence for the synthetic binding site gene was derived from the Fv protein
sequence of 26-10 by choosing codons frequently used in E. coli. The model of this representative
synthetic gene is shown in Figure 8, discussed previously. Synthetic genes coding for the trp
promoter-operator, the modified trp LE leader peptide (MLE), the sequence of which is shown in Figure 10A, and V H
were prepared largely as described previously. The gene coding for V H was assembled from 46 chemically
synthesized oligonucleotides, all 15 bases long, except for terminal fragments (13 to 19 bases) that included
cohesive cloning ends. Between 8 and 15 overlapping oligonucleotides were enzymatically ligated into
double stranded DNA, cut at restriction sites suitable for cloning (NarI, XbaI, SalI, SacII, SacI), purified by
PAGE on 8% gels, and cloned in pUC which was modified to contain additional cloning sites in the
polylinker. The cloned segments were assembled stepwise into the complete gene mimicking V H by
ligations in the pUC cloning vector.
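The idea of assembling a duplex from short overlapping oligonucleotides can be sketched as follows. The tiling scheme is an assumption for illustration only: top-strand oligomers of 15 bases, with bottom-strand breakpoints offset by 7 bases so that every junction on one strand is bridged by an oligomer on the other. The example sequence and function names are hypothetical, and the terminal fragments here are simply truncated rather than extended with cohesive cloning ends as in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(top, length=15, stagger=7):
    """Split a duplex into top- and bottom-strand oligos with staggered junctions.

    Bottom-strand oligos are returned in top-strand order, each written 5' to 3'.
    """
    if not 0 < stagger < length:
        raise ValueError("stagger must lie strictly between 0 and length")
    top_oligos = [top[i:i + length] for i in range(0, len(top), length)]
    # Bottom-strand breakpoints, expressed in top-strand coordinates.
    cuts = [0] + list(range(stagger, len(top), length)) + [len(top)]
    bottom_oligos = [revcomp(top[a:b]) for a, b in zip(cuts, cuts[1:])]
    return top_oligos, bottom_oligos


# Hypothetical 45-base top strand.
duplex_top = "ACCTTTACCAACTATTATATTCATTGGCTGAAAGGCGGCGGCAGC"
top_oligos, bottom_oligos = tile_oligos(duplex_top)
print([len(o) for o in top_oligos], [len(o) for o in bottom_oligos])
```

Because each top-strand junction falls in the middle of a bottom-strand oligomer, annealing holds neighbouring oligomers in register for ligation, which is the property the overlapping design depends on.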

The gene mimicking 26-10 V L was assembled from 12 long synthetic polynucleotides ranging in size
from 33 to 88 base pairs, prepared in automated DNA synthesizers (Model 6500, Biosearch, San Rafael,
CA; Model 380A, Applied Biosystems, Foster City, CA). Five individual double stranded segments were
made out of pairs of long synthetic oligonucleotides spanning six-base restriction sites in the gene (AatII,
BstEII, KpnI, HindIII, BglII, and PstI). In one case, four long overlapping strands were combined and cloned.
Gene fragments bounded by restriction sites for assembly that were absent from the pUC polylinker, such
as AatII and BstEII, were flanked by EcoRI and BamHI ends to facilitate cloning.

The linker between V H and V L, encoding (Gly-Gly-Gly-Gly-Ser)3, was cloned from two long synthetic
oligonucleotides, 54 and 62 bases long, spanning SacI and AatII sites, the latter followed by an EcoRI
cloning end. The complete single chain binding site gene was assembled from the V H, V L, and linker genes
to produce a construct, corresponding to aspartyl-prolyl-V H -<linker>-V L, flanked by EcoRI and PstI restriction
sites.

The trp promoter-operator, starting from its SspI site, was assembled from 12 overlapping 15 base
oligomers, and the MLE leader gene was assembled from 24 overlapping 15 base oligomers. These were
cloned and assembled in pUC using the strategy of assembly sites flanked by cloning sites. The final
expression plasmid was constructed in the pBR322 vector by a 3-part ligation using the sites SspI, EcoRI,
and PstI (see Figure 10B). Intermediate DNA fragments and assembled genes were sequenced by the
dideoxy method.

Fusion Protein Expression:

Single-chain protein was expressed as a fusion protein. The MLE leader gene (Fig. 10A) was derived
from E. coli trp LE sequence and expressed under the control of a synthetic trp promoter and operator. E.
coli strain JM83 was transformed with the expression plasmid and protein expression was induced in M9
minimal medium by addition of indoleacrylic acid (10 µg/ml) at a cell density with A600 = 1. The high
expression levels of the fusion protein resulted in its accumulation as insoluble protein granules, which were
harvested from cell paste (Figure 11, Lane 1).

Fusion Protein Cleavage:

The MLE leader was removed from the binding site protein by acid cleavage of the Asp-Pro peptide
bond engineered at the junction of the MLE and binding site sequences. The washed protein granules
containing the fusion protein were cleaved in 6 M guanidine-HCl + 10% acetic acid, pH 2.5, incubated at
37°C for 96 hrs. The reaction was stopped through precipitation by addition of a 10-fold excess of ethanol
with overnight incubation at -20°C, followed by centrifugation and storage at -20°C until further purification
(Figure 11, Lane 2).

Protein Purification:

The acid cleaved binding site was separated from remaining intact fused protein species by
chromatography on DEAE cellulose. The precipitate obtained from the cleavage mixture was redissolved in 6 M
guanidine-HCl + 0.2 M Tris-HCl, pH 8.2, + 0.1 M 2-mercaptoethanol and dialyzed exhaustively against 6
M urea + 2.5 mM Tris-HCl, pH 7.5, + 1 mM EDTA. 2-Mercaptoethanol was added to a final concentration
of 0.1 M, the solution was incubated for 2 hrs at room temperature and loaded onto a 2.5 × 45 cm column
of DEAE cellulose (Whatman DE 52), equilibrated with 6 M urea + 2.5 mM Tris-HCl + 1 mM EDTA, pH
7.5. The intact fusion protein bound weakly to the DE 52 column such that its elution was retarded relative
to that of the binding protein. The first protein fractions which eluted from the column after loading and
washing with urea buffer contained BABS protein devoid of intact fusion protein. Later fractions
contaminated with some fused protein were pooled and rechromatographed on DE 52, and the recovered single chain
binding protein was combined with other purified protein into a single pool (Figure 11, Lane 3).

Refolding:

The 26-10 binding site mimic was refolded as follows: the DE 52 pool, disposed in 6 M urea + 2.5 mM
Tris-HCl + 1 mM EDTA, was adjusted to pH 8 and reduced with 0.1 M 2-mercaptoethanol at 37°C for 90
min. This was diluted at least 100-fold with 0.01 M sodium acetate, pH 5.5, to a concentration below 10
µg/ml and dialyzed at 4°C for 2 days against acetate buffer.

Affinity Chromatography:

Purification of active binding protein by affinity chromatography at 4°C on a ouabain-amine-Sepharose
column was performed. The dilute solution of refolded protein was loaded directly onto a pair of tandem
columns, each containing 3 ml of resin equilibrated with the 0.01 M acetate buffer, pH 5.5. The columns
were washed individually with an excess of the acetate buffer, and then by sequential additions of 5 ml
each of 1 M NaCl, 20 mM ouabain, and 3 M potassium thiocyanate dissolved in the acetate buffer,
interspersed with acetate buffer washes. Since digoxin binding activity was still present in the eluate, the
eluate was pooled and concentrated 20-fold by ultrafiltration (PM 10 membrane, 200 ml concentrator;
Amicon), reapplied to the affinity columns, and eluted as described. Fractions with significant absorbance at
280 nm were pooled and dialyzed against PBSA or the above acetate buffer. The amounts of protein in the
DE 52 and ouabain-Sepharose pools were quantitated by amino acid analysis following dialysis against 0.01
M acetate buffer. The results are shown below in Table 1.

TABLE 1

Estimated Yields of BABS Protein During Purification

Step                       Wet wt per liter   mg protein     Cleavage yield (%),   Yield relative to
                                                             prior step            fusion protein (%)

Cell paste                 12.0 g             1440.0 (a)
Fusion protein granules    2.3 g              480.0 (a, b)   100.0                 100.0
Acid cleavage/DE 52 pool                      144.0          38.0                  38.0
Ouabain-Sepharose pool                        18.1           12.6 (d)              4.7

(a) Determined by Lowry protein analysis
(b) Determined by absorbance measurements
(c) Determined by amino acid analysis
(d) Calculated from the amount of BABS protein specifically eluted from ouabain-Sepharose relative
    to that applied to the resin; values were determined by amino acid analysis
(e) Percentage yield calculated on a molar basis



Sequence Analysis of Gene and Protein:

The complete gene was sequenced in both directions using the dideoxy method of Sanger, which
confirmed the gene was correctly assembled. The protein sequence was also verified by protein
sequencing. Automated Edman degradation was conducted on intact protein (residues 1-40), as well as on two
major CNBr fragments (residues 108-129 and 140-159) with a Model 470A gas phase sequencer equipped
with a Model 120A on-line phenylthiohydantoin-amino acid analyzer (Applied Biosystems, Foster City, CA).
Homogeneous binding protein, fractionated by SDS-PAGE and eluted from gel strips with water, was treated
with a 20,000-fold excess of CNBr in 1% trifluoroacetic acid-acetonitrile (1:1) for 12 hrs at 25°C (in the
dark). The resulting fragments were separated by SDS-PAGE and transferred electrophoretically onto an
Immobilon membrane (Millipore, Bedford, MA), from which stained bands were cut out and sequenced.

Specificity Determination:

Specificities of anti-digoxin 26-10 Fab and the BABS were assessed by radioimmunoassay. Wells of
microtiter plates were coated with affinity-purified goat anti-murine Fab fragment (ICN ImmunoBiologicals,
Lisle, IL) at 10 µg/ml in PBSA overnight at 4°C. After the plates were washed and blocked with 1% horse
serum in PBSA, solutions (50 µl) containing 26-10 Fab or the BABS in either PBSA or 0.01 M sodium
acetate at pH 5.5 were added to the wells and incubated 2-3 hrs at room temperature. After unbound
antibody fragment was washed from the wells, 25 µl of a series of concentrations of cardiac glycosides
(10^-4 to 10^-11 M in PBSA) were added. The cardiac glycosides tested included digoxin, digitoxin,
digoxigenin, digitoxigenin, gitoxin, ouabain, and acetyl strophanthidin. After the addition of 125I-digoxin (25
µl, 50,000 cpm; Cambridge Diagnostics, Billerica, MA) to each well, the plates were incubated overnight at
4°C, washed and counted. The inhibition curves are plotted in Figure 12. The relative affinities for each
digoxin analogue were calculated by dividing the concentration of each analogue at 50% inhibition by the
concentration of digoxin (or digoxigenin) that gave 50% inhibition. There is a displacement of inhibition
curves for the BABS to lower glycoside concentrations than observed for 26-10 Fab, because less active
BABS than 26-10 Fab was bound to the plate. When 0.25 M urea was added to the BABS in 0.01 M sodium
acetate, pH 5.5, more active sFv was bound to the goat anti-murine Fab coating on the plate. This caused
the BABS inhibition curves to shift toward higher glycoside concentrations, closer to the position of those for
26-10 Fab, although maintaining the relative positions of curves for sFv obtained in acetate buffer alone.
The results, expressed as normalized concentration of inhibitor giving 50% inhibition of 125I-digoxin binding,
are shown in Table 2.



TABLE 2

26-10 Antibody   Normalizing
Species          Glycoside      D     DG    DO    DOG   A-S   G     O

Fab              Digoxin        1.0   1.2   0.9   1.0   1.3   9.6   15
                 Digoxigenin    0.9   1.0   0.8   0.9   1.1   8.1   13
BABS             Digoxin        1.0   7.3   2.0   2.6   5.9   62    150
                 Digoxigenin    0.1   1.0   0.3   0.4   0.8   8.5   21

D = Digoxin
DG = Digoxigenin
DO = Digitoxin
DOG = Digitoxigenin
A-S = Acetyl Strophanthidin
G = Gitoxin
O = Ouabain
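The normalization used for Table 2 is a simple ratio, and the two BABS rows can be related to each other with a short sketch. The digoxin-normalized BABS values below are copied from Table 2; the function and variable names are hypothetical. Dividing each value by the digoxigenin entry re-expresses the row relative to digoxigenin.

```python
# Digoxin-normalized 50%-inhibition values for the BABS, copied from Table 2.
babs_vs_digoxin = {
    "D": 1.0, "DG": 7.3, "DO": 2.0, "DOG": 2.6,
    "A-S": 5.9, "G": 62.0, "O": 150.0,
}


def renormalize(values, reference):
    """Divide every entry by the entry for the `reference` glycoside."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


babs_vs_digoxigenin = renormalize(babs_vs_digoxin, "DG")
for name, value in babs_vs_digoxigenin.items():
    print(f"{name:>4}: {value:.2f}")
```

Rounded to the precision shown in the table, the recomputed values agree with the digoxigenin-normalized BABS row. The same ratio applied to the Fab row agrees only approximately, presumably because the tabulated entries are themselves rounded.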






Affinity Determination:

Association constants were measured by equilibrium binding studies. In immunoprecipitation
experiments, 100 µl of 3H-digoxin (New England Nuclear, Billerica, MA) at a series of concentrations (10^-7 M to
10^-11 M) were added to 100 µl of 26-10 Fab or the BABS at a fixed concentration. After 2-3 hrs of
incubation at room temperature, the protein was precipitated by the addition of 100 µl goat antiserum to
murine Fab fragment (ICN ImmunoBiologicals), 50 µl of the IgG fraction of rabbit anti-goat IgG (ICN
ImmunoBiologicals), and 50 µl of a 10% suspension of protein A-Sepharose (Sigma). Following 2 hrs at
4°C, bound and free antigen were separated by vacuum filtration on glass fiber filters (Vacuum Filtration
Manifold, Millipore, Bedford, MA). Filter disks were then counted in 5 ml of scintillation fluid with a Model
1500 Tri-Carb Liquid Scintillation Analyzer (Packard, Sterling, VA). The association constants, Ko, were
calculated from Scatchard analyses of the untransformed radioligand binding data using LIGAND, a
non-linear curve fitting program based on mass action. Ko values were also calculated by Sips plots and binding
isotherms shown in Figure 13A for the BABS and 13B for the Fab. For binding isotherms, data are plotted
as the concentration of digoxin bound versus the log of the unbound digoxin concentration, and the
dissociation constant is estimated from the ligand concentration at 50% saturation. These binding data are
also plotted in linear form as Sips plots (inset), having the same abscissa as the binding isotherm but with
the ordinate representing log [r/(n-r)], defined below. The average intrinsic association constant (Ko) was
calculated from the modified Sips equation (39), log [r/(n-r)] = a log C - a log Ko, where r equals moles of
digoxin bound per mole of antibody at an unbound digoxin concentration equal to C; n is the number of
moles of digoxin bound at saturation of the antibody binding site, and a is an index of heterogeneity which
describes the distribution of association constants about the average intrinsic association constant Ko. Least
squares linear regression analysis of the data indicated correlation coefficients for the lines obtained were
0.96 for the BABS and 0.99 for 26-10 Fab. A summary of the calculated association constants is shown
below in Table 3.



TABLE 3

                           Association Constant, Ko
Method of Data Analysis    Ko (BABS), M^-1     Ko (Fab), M^-1

Scatchard plot             [illegible]         [illegible] × 10^8
Sips plot                  2.6 × 10^7          1.8 × 10^8
Binding isotherm           5.2 × 10^7          3.3 × 10^8
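The Sips plot analysis described above amounts to a straight-line fit of log [r/(n-r)] against log C. The sketch below illustrates such a fit on synthetic data; it is not the LIGAND program and uses no data from this disclosure. It assumes the convention r/(n-r) = (Ko·C)^a and reads Ko from the half-saturation point, where log [r/(n-r)] = 0, so the result does not depend on how the intercept term is written. All names and the synthetic values are hypothetical.

```python
import math


def sips_fit(conc, bound, n=1.0):
    """Least-squares line through log10[r/(n-r)] versus log10 C.

    Returns (a, ko): the slope (heterogeneity index) and the association
    constant taken as the reciprocal of the free ligand concentration at
    half saturation.
    """
    xs = [math.log10(c) for c in conc]
    ys = [math.log10(r / (n - r)) for r in bound]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    intercept = mean_y - a * mean_x
    c_half = 10 ** (-intercept / a)  # concentration where r/(n-r) = 1
    return a, 1.0 / c_half


# Synthetic, noise-free data generated from assumed parameters.
true_a, true_ko = 0.9, 1e8
conc = [10 ** e for e in (-10, -9.5, -9, -8.5, -8, -7.5, -7)]
bound = [(true_ko * c) ** true_a / (1 + (true_ko * c) ** true_a) for c in conc]

a, ko = sips_fit(conc, bound)
print(f"a = {a:.3f}, Ko = {ko:.3g} M^-1")
```

With real data the points scatter about the line, and the correlation coefficient of the fit (0.96 and 0.99 in the text) indicates how well the linear form describes them.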

III. Synthesis of a Multifunctional Protein

A nucleic acid sequence encoding the single chain binding site described above was fused with a
sequence encoding the FB fragment of protein A as a leader to function as a second active region. As a
spacer, the native amino acids comprising the last 11 amino acids of the FB fragment bonded to an
Asp-Pro dilute acid cleavage site were employed. The FB binding domain of the FB consists of the immediately
preceding 43 amino acids which assume a helical configuration (see Fig. 2B).

The gene fragments are synthesized using a Biosearch DNA Model 8600 Synthesizer as described
above. Synthetic oligonucleotides are cloned according to established protocol described above using the
pUC8 vector transfected into E. coli. The completed fused gene set forth in Figure 6A is then expressed in
E. coli.

After sonication, inclusion bodies were collected by centrifugation, and dissolved in 6 M guanidine
hydrochloride (GuHCl), 0.2 M Tris, and 0.1 M 2-mercaptoethanol (BME), pH 8.2. The protein was denatured
and reduced in the solvent overnight at room temperature. Size exclusion chromatography was used to
purify fusion protein from the inclusion bodies. A Sepharose 4B column (1.5 × 80 cm) was run in a solvent
of 6 M GuHCl and 0.01 M NaOAc, pH 4.75. The protein solution was applied to the column at room
temperature in 0.5-1.0 ml amounts. Fractions were collected and precipitated with cold ethanol. These were
run on SDS gels, and fractions rich in the recombinant protein (approximately 34,000 D) were pooled. This
offers a simple first step for cleaning up inclusion body preparations without suffering significant proteolytic
degradation.

For refolding, the protein was dialyzed against 100 ml of the same GuHCl-Tris-BME solution, and
dialysate was diluted 11-fold over two days to 0.55 M GuHCl, 0.01 M Tris, and 0.01 M BME. The dialysis
sacks were then transferred to 0.01 M NaCl, and the protein was dialyzed exhaustively before being
assayed by RIAs for binding of 125I-labelled digoxin. The refolding procedure can be simplified by making a
rapid dilution with water to reduce the GuHCl concentration to 1.1 M, and then dialyzing against phosphate
buffered saline (0.15 M NaCl, 0.05 M potassium phosphate, pH 7, containing 0.03% NaN3), so that it is free
of any GuHCl within 12 hours. Product of both types of preparation showed binding activity, as indicated in
Figure 7A.

Demonstration of Bifunctionality:

This protein with an FB leader and a fused BABS is bifunctional; the BABS can bind the antigen and
the FB can bind the Fc regions of immunoglobulins. To demonstrate this dual and simultaneous activity,
several radioimmunoassays were performed.

Properties of the binding site were probed by a modification of an assay developed by Mudgett-Hunter
et al. (J. Immunol. (1982) 129:1165-1172; Molec. Immunol. (1985) 22:477-488), so that it could be run on
microtiter plates as a solid phase sandwich assay. Binding data were collected using goat anti-murine Fab
antisera (gAmFab) as the primary antibody that initially coats the wells of the plate. These are polyclonal
antisera which recognize epitopes that appear to reside mostly on framework regions. The samples of
interest are next added to the coated wells and incubated with the gAmFab, which binds species that
exhibit appropriate antigenic sites. After washing away unbound protein, the wells are exposed to
125I-labelled (radioiodinated) digoxin conjugates, either as 125I-dig-BSA or 125I-dig-lysine.

The data are plotted in Figure 7A, which shows the results of a dilution curve experiment in which the
parent 26-10 antibody was included as a control. The sites were probed with 125I-dig-BSA as described
above, with a series of dilutions prepared from initial stock solutions, including both the slowly refolded (1)
and fast diluted/quickly refolded (2) single chain proteins. The parallelism between all three dilution curves
indicates that gAmFab binding regions on the BABS molecule are essentially the same as on the Fv of
authentic 26-10 antibody, i.e., the surface epitopes appear to be the same for both proteins.

The sensitivity of these assays is such that binding affinity of the Fv for digoxin must be at least 10⁶ M⁻¹.
Experimental data on digoxin binding yielded binding constants in the range of 10⁸ to 10⁹ M⁻¹. The parent
26-10 antibody has an affinity of 5.4 × 10⁹ M⁻¹. Inhibition assays also indicate the binding of ¹²⁵I-dig-lysine,
which can be inhibited by unlabelled digoxin, digoxigenin, digitoxin, digitoxigenin, gitoxin, acetyl
strophanthidin, and ouabain in a way largely parallel to the parent 26-10 Fab. This indicates that the
specificity of the biosynthetic protein is substantially identical to that of the original monoclonal.
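
The reported association constants can be put on a common footing with a short calculation. The sketch below, in Python, converts each association constant to a dissociation constant (Kd = 1/Ka) and expresses the gap to the parent antibody in orders of magnitude. The numerical values are those given above; the function and variable names are illustrative assumptions.

```python
import math

PARENT_KA = 5.4e9           # M^-1, parent 26-10 antibody (value from the text)
BABS_KA_RANGE = (1e8, 1e9)  # M^-1, range reported for the single chain protein


def kd_from_ka(ka: float) -> float:
    """Dissociation constant (M) as the reciprocal of the association constant."""
    return 1.0 / ka


def log10_gap(ka_reference: float, ka: float) -> float:
    """Orders of magnitude by which ka falls below ka_reference."""
    return math.log10(ka_reference / ka)


for ka in BABS_KA_RANGE:
    print(
        f"Ka = {ka:.1e} M^-1, Kd = {kd_from_ka(ka):.1e} M, "
        f"gap to parent = {log10_gap(PARENT_KA, ka):.2f} orders of magnitude"
    )
```

Both ends of the reported range lie less than two orders of magnitude below the parent antibody's affinity.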

In a second type of assay, digoxin-BSA is used to coat microtiter plates. Renatured BABS (FB-BABS)
is added to the coated plates so that only molecules that have a competent binding site can stick to the
plate. ¹²⁵I-labelled rabbit IgG (radioligand) is mixed with bound FB-BABS on the plates. Bound radioactivity
reflects the interaction of IgG with the FB domain of the BABS, and the specificity of this binding is
demonstrated by its inhibition with increasing amounts of FB, Protein A, rabbit IgG, IgG2a, and IgG1, as
shown in Figure 7B.

The following species were tested in order to demonstrate authentic binding: unlabelled rabbit IgG and
IgG2a monoclonal antibody (which bind competitively to the FB domain of the BABS); and protein A and
FB (which bind competitively to the radioligand). As shown in Figure 7B, these species are found to
completely inhibit radioligand binding, as expected. A monoclonal antibody of the IgG1 subclass binds
poorly to the FB, as expected, inhibiting only about 34% of the radioligand from binding. These data
indicate that the BABS domain and the FB domain have independent activity.
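
A percent-inhibition figure such as the 34% quoted above is conventionally derived from bound radioactivity measured with and without competitor. The following Python sketch shows that conventional calculation. It is an assumption that the figure was computed this way, and the counts used are hypothetical values chosen only to illustrate the arithmetic; they are not data from the experiments described.

```python
def percent_inhibition(bound_with_inhibitor: float,
                       bound_without_inhibitor: float,
                       background: float = 0.0) -> float:
    """Percent reduction in specifically bound radioligand caused by an inhibitor."""
    specific_max = bound_without_inhibitor - background
    if specific_max <= 0:
        raise ValueError("uninhibited binding must exceed background")
    specific = bound_with_inhibitor - background
    return 100.0 * (1.0 - specific / specific_max)


# Hypothetical counts per minute, invented for illustration only.
uninhibited_cpm = 10_000
with_igg1_cpm = 6_600

print(f"{percent_inhibition(with_igg1_cpm, uninhibited_cpm):.0f}% inhibition")
```

A competitor that displaces all of the radioligand gives 100% by the same formula, which corresponds to the complete inhibition described for rabbit IgG, IgG2a, protein A, and FB.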

IV. OTHER CONSTRUCTS

Other BABS-containing proteins constructed according to the invention, expressible in E. coli and other
host cells as described above, are set forth in the drawing. These proteins may be bifunctional or
multifunctional. Each construct includes a single chain BABS linked via a spacer sequence to an effector
molecule comprising amino acids encoding a biologically active effector protein such as an enzyme,
receptor, toxin, or growth factor. Some examples of such constructs shown in the drawing include proteins
comprising epidermal growth factor (EGF) (Figure 15A), streptavidin (Figure 15B), tumor necrosis factor
(TNF) (Figure 15C), calmodulin (Figure 15D), the beta chain of platelet derived growth factor (B-PDGF)
(Figure 15E), ricin A (Figure 15F), interleukin 2 (Figure 15G), and FB dimer (Figure 15H). Each is used as a
trailer and is connected to a
preselected BABS via a spacer (Gly-Ser-Gly) encoded by DNA defining a BamHI restriction site. Additional
amino acids may be added to the spacer for empirical refinement of the construct if necessary by opening
up the BamHI site and inserting an oligonucleotide of a desired length having BamHI sticky ends. Each
gene also terminates with a PstI site to facilitate insertion into a suitable expression vector.
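
The relationship between the BamHI site and the Gly-Ser-Gly spacer can be illustrated with a short Python sketch. The BamHI recognition sequence GGATCC reads in frame as the codons GGA (Gly) and TCC (Ser). The third codon of the spacer and the inserted oligonucleotide used below are hypothetical choices made for illustration, and the function names are not taken from the disclosure.

```python
# Minimal codon table: only the glycine and serine codons needed here.
CODON_TO_RESIDUE = {
    "GGA": "Gly", "GGC": "Gly", "GGG": "Gly", "GGT": "Gly",
    "TCA": "Ser", "TCC": "Ser", "TCG": "Ser", "TCT": "Ser",
    "AGC": "Ser", "AGT": "Ser",
}
BAMHI_SITE = "GGATCC"  # BamHI cuts G^GATCC, leaving a 5' GATC overhang


def translate(dna: str) -> str:
    """Translate an in-frame Gly/Ser coding sequence into three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "-".join(CODON_TO_RESIDUE[codon] for codon in codons)


def insert_at_bamhi(dna: str, core: str) -> str:
    """Top strand after ligating an oligo with GATC sticky ends into the cut site.

    `core` is the part of the oligo lying between its two GATC overhangs.
    """
    cut = dna.index(BAMHI_SITE) + 1  # the cut falls just after the first G
    return dna[:cut] + "GATC" + core + dna[cut:]


spacer_dna = "GGATCCGGT"                        # hypothetical: third codon assumed
lengthened = insert_at_bamhi(spacer_dna, "TG")  # hypothetical oligo core

print(translate(spacer_dna))
print(translate(lengthened))
```

The insertion adds four nucleotides from the regenerated overhang plus the core, so the reading frame is preserved only when that total is a multiple of three. In this sketch six nucleotides are added, the spacer grows by two residues, and one BamHI site is regenerated downstream of the insert.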

The BABS of the EGF and PDGF constructs may be, for example, specific for fibrin so that the EGF or
PDGF is delivered to the site of a wound. The BABS for TNF and ricin A may be specific to a tumor
antigen, e.g., CEA, to produce a construct useful in cancer therapy. The calmodulin construct binds
radioactive ions and other metal ions. Its BABS may be specific, for example, to fibrin or a tumor antigen,
so that it can be used as an imaging agent to locate a thrombus or tumor. The streptavidin construct binds
with biotin with very high affinity. The biotin may be labeled with a remotely detectable ion for imaging
purposes. Alternatively, the biotin may be immobilized on an affinity matrix or solid support. The
BABS-streptavidin protein could then be bound to the matrix or support for affinity chromatography or solid phase
immunoassay. The interleukin-2 construct could be linked, for example, to a BABS specific for a T-cell
surface antigen. The FB-FB dimer binds to Fc, and could be used with a BABS in an immunoassay or
affinity purification procedure linked to a solid phase through immobilized immunoglobulin.

Figure 14 exemplifies a multifunctional protein having an effector segment as a leader. It comprises an
FB-FB dimer linked through its C-terminal via an Asp-Pro dipeptide to a BABS of choice. It functions in a
way very similar to the construct of Fig. 15H. The dimer binds avidly to the Fc portion of immunoglobulin.
This type of construct can accordingly also be used in affinity chromatography, solid phase immunoassay,
and in therapeutic contexts where coupling of immunoglobulins to another epitope is desired.

In view of the foregoing, it should be apparent that the invention is unlimited with respect to the specific
types of BABS and effector proteins to be linked. Accordingly, other embodiments are within the following
claims.

The invention covers a single chain multifunctional biosynthetic protein expressed from a single gene
derived by recombinant DNA techniques, said protein comprising:

a biosynthetic antibody binding site capable of binding to a preselected antigenic determinant and
comprising at least one protein domain, the amino acid sequence of said domain being homologous to at
least a portion of the sequence of a variable region of an immunoglobulin molecule capable of binding said
preselected antigenic determinant; and, peptide bonded to the N or C terminus thereof,

a polypeptide selected from the group consisting of effector proteins having a conformation suitable for
biological activity in mammals, amino acid sequences capable of sequestering an ion, and amino acid
sequences capable of selective binding to a solid support. The binding site may comprise at least two
domains connected by peptide bonds to a polypeptide linker, and the two domains mimic a VH and a VL
from a natural immunoglobulin.

The amino acid sequence of each of said domains may comprise a set of CDRs interposed between a
set of FRs, each of which is respectively homologous with at least a portion of CDRs and FRs from a said
variable region of an immunoglobulin molecule capable of binding said preselected antigenic determinant.
At least one of the domains may comprise a set of CDRs homologous to a portion of the CDRs in a first
immunoglobulin and a set of FRs homologous to a portion of the FRs in a second, distinct immunoglobulin.

The polypeptide linker may span a distance of at least 40 angstroms and may be hydrophilic; or
may comprise amino acids which together assume an unstructured polypeptide configuration in
aqueous solution; or

may be cysteine-free; or

may comprise a plurality of glycine or alanine residues; or

may comprise plural consecutive copies of an amino acid sequence; or

may comprise one or a pair of amino acid sequences recognizable by a site specific cleavage agent.

The antibody binding site preferably binds with said antigenic determinant with a specificity at least
substantially identical to the binding specificity of said immunoglobulin molecule; or

may bind said antigenic determinant with an affinity of at least about 10* M⁻¹; or

may bind said antigenic determinant with an affinity no less than about two orders of magnitude less
than the binding affinity of said immunoglobulin molecule.

The protein of the invention may further comprise a polypeptide spacer incorporated therein interposed
between said antibody binding site and said polypeptide. In this case, the polypeptide spacer may
comprise amino acids selectively susceptible to cleavage; or

may be hydrophilic; or

may comprise amino acids which together assume an unstructured polypeptide configuration in
aqueous solution; or may be cysteine-free; or may comprise a plurality of glycine or alanine residues; or
may comprise plural consecutive copies of an amino acid sequence.

The effector protein may be an enzyme, toxin, receptor, binding site, biosynthetic antibody binding site,
growth factor, cell-differentiation factor, lymphokine, cytokine, hormone, or anti-metabolite. The sequence 
capable of sequestering an ion may be calmodulin, metallothionein, a fragment thereof, or an amino acid 
sequence rich in at least one of glutamic acid, aspartic acid, lysine, and arginine. 

The polypeptide sequence capable of selective binding to a solid support may be a positively or
negatively charged amino acid sequence, a cysteine-containing amino acid sequence, streptavidin, or a
fragment of protein A.

The protein of the invention may comprise a plurality of biosynthetic antibody binding sites; or 
an additional biofunctional domain. 

The invention also covers a DNA encoding the protein of claim 1, or a host cell harboring and capable
of expressing said DNA.

The invention also covers a biosynthetic binding protein expressed from DNA derived by recombinant
techniques,

said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains
connected by a polypeptide linker, the amino acid sequence of each of said polypeptide domains
comprising a set of CDRs interposed between a set of FRs, each of which is respectively homologous with
at least a portion of CDRs and FRs from an immunoglobulin molecule,

at least one of said domains comprising a said set of CDR amino acid sequences homologous to a
portion of the CDR amino acid sequences of a first immunoglobulin molecule, and a set of FR amino acid
sequences homologous to a portion of the FR sequences of a second, distinct immunoglobulin molecule,
said polypeptide domains together defining a hybrid synthetic binding site having specificity for a
preselected antigen.

In this latter aspect, the domains may comprise FRs homologous to a portion of the FRs of a human
immunoglobulin; or

said polypeptide domains may be peptide bonded to a biologically active amino acid sequence.
Moreover, the binding protein may further comprise a radioactive atom bound to said binding protein.

The invention also embraces a DNA encoding the binding protein of claim 17, or a host cell harboring 
and capable of expressing said DNA. 

Also contemplated is a biosynthetic binding protein expressed from DNA derived by recombinant
techniques,

said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains
connected by a polypeptide linker, the amino acid sequence of each of said polypeptide domains
comprising a set of CDRs interposed between a set of FRs, each of which is respectively homologous with
at least a portion of CDRs and FRs from an immunoglobulin molecule,

said polypeptide linker comprising plural, peptide-bonded amino acids defining a polypeptide of a
length sufficient to span the distance between the C-terminal end of one of said domains and the N-terminal
end of the other of said domains when said binding protein assumes a conformation suitable for binding,
and comprising hydrophilic amino acids which together assume an unstructured polypeptide configuration
in aqueous solution,

said binding protein being capable of binding to a preselected antigenic site, determined by the
collective tertiary structure of said sets of CDRs held in proper conformation by said sets of FRs and said
linker when disposed in aqueous solution.

According to this aspect of the invention, said polypeptide linker may span a distance of at least about
40 Å when said binding protein is disposed in aqueous solution in a conformation suitable for binding said
preselected antigen; or may comprise a plurality of glycine or alanine residues; or may comprise plural
consecutive copies of an amino acid sequence; or may comprise (Gly-Gly-Gly-Gly-Ser)₃.
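
The linker properties listed above can be checked for the (Gly-Gly-Gly-Gly-Ser)₃ example with a short Python sketch. The figure of about 3.8 Å per residue (the approximate spacing of consecutive alpha carbons) is an assumption of this sketch rather than a value from the disclosure, and it gives only a rough upper bound for a fully extended chain. The function names are illustrative.

```python
CA_CA_SPACING_ANGSTROM = 3.8  # approximate; an assumption of this sketch


def build_linker(unit: str = "GGGGS", copies: int = 3) -> str:
    """Plural consecutive copies of a repeat unit, in one-letter code."""
    return unit * copies


def max_span_angstrom(sequence: str) -> float:
    """Rough upper bound on the distance a fully extended chain can span."""
    return len(sequence) * CA_CA_SPACING_ANGSTROM


def is_cysteine_free(sequence: str) -> bool:
    """True when the sequence contains no cysteine (C) residues."""
    return "C" not in sequence


def gly_ala_count(sequence: str) -> int:
    """Number of glycine (G) and alanine (A) residues in the sequence."""
    return sequence.count("G") + sequence.count("A")


linker = build_linker()
print(linker, len(linker), "residues")
print(f"extended span up to about {max_span_angstrom(linker):.0f} angstroms")
print("cysteine-free:", is_cysteine_free(linker))
print("Gly/Ala residues:", gly_ala_count(linker))
```

On this estimate the 15-residue linker can exceed the 40 Å span even when not fully extended, which is consistent with its unstructured character in solution.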

At least one of the domains may comprise a set of CDRs homologous to a portion of the CDRs in a first
immunoglobulin and a set of FRs homologous to a portion of the FRs of a second, distinct, human
immunoglobulin; or

at least one of said polypeptide domains may be peptide bonded to a biologically active amino acid
sequence.

The invention also covers a biosynthetic binding protein expressed from DNA derived by recombinant
techniques,

said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains
connected by a polypeptide linker, the amino acid sequence of each of said polypeptide domains
comprising a set of CDRs interposed between a set of FRs, each of which is respectively homologous
with at least a portion of CDRs and FRs from an immunoglobulin molecule,

said binding protein being capable of binding to a preselected antigenic determinant, determined by the
collective tertiary structure of said sets of CDRs held in proper conformation by said sets of FRs when
disposed in aqueous solution, with a binding specificity at least substantially identical to the binding
specificity of said immunoglobulin molecule comprising said homologous CDRs.

Also covered is a biosynthetic binding protein expressed from DNA derived by recombinant techniques,
said binding protein comprising a single polypeptide chain comprising at least two polypeptide domains
connected by a polypeptide linker, the amino acid sequence of each of said polypeptide domains
comprising a set of CDRs interposed between a set of FRs, each of which is respectively homologous
with at least a portion of CDRs and FRs from an immunoglobulin molecule,

said binding protein being capable of binding to a preselected antigenic determinant, determined by the
collective tertiary structure of said sets of CDRs held in proper conformation by said sets of FRs when
disposed in aqueous solution, with a binding affinity at least 10⁶ M⁻¹.

This binding protein may have a binding affinity at least about 10* M⁻¹; or

no less than two orders of magnitude less than the binding affinity of said immunoglobulin molecule
comprising said homologous CDRs. Also at least one of said polypeptide domains may be peptide bonded
to a biologically active amino acid sequence.
The binding protein of the invention may further comprise a radioactive atom bound to said polypeptide
chain.

Claims 

1. A biosynthetic single chain polypeptide comprising a linking sequence connecting first and second
 non-naturally peptide-bonded, biologically active polypeptide domains to form a single polypeptide chain
 comprising at least two biologically active domains, connected by said linking sequence, said linking
 sequence comprising hydrophilic, peptide-bonded amino acids comprising at least 10 amino acid
 residues, said linking sequence being cysteine-free, having a flexible unstructured polypeptide
 configuration essentially free of secondary structure in aqueous solution, having a plurality of glycine or serine
 residues and defining a polypeptide of a length sufficient to span the distance between the C-terminal
 end of the first domain and the N-terminal end of the second domain.

2. The biosynthetic polypeptide of claim 1 wherein said linking sequence comprises threonine.

3. The biosynthetic polypeptide of claim 1 or claim 2 further comprising said first domain connected by a
 peptide bond to said N-terminal end of said linking sequence and a second domain connected by a
 peptide bond to the C-terminal end of said linking sequence.

4. The biosynthetic polypeptide of claim 1 wherein said linking sequence comprises plural consecutive
 copies of an amino acid sequence.

5. The biosynthetic polypeptide of claim 4 comprising the amino acid sequence
 GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer.

6. The biosynthetic polypeptide of claim 1 wherein said linking sequence comprises one or a pair of
 amino acid sequences recognizable by a site specific cleavage agent.

7. A DNA encoding the biosynthetic polypeptide of any of claims 1-6. 

8. A biosynthetic linker comprising a polypeptide linking two non-naturally linked polypeptide domains to
 form a multifunctional protein, said linker comprising plural, hydrophilic, peptide-bonded amino acids
 which define a polypeptide of a length sufficient to span the distance between the C-terminal end
 of a first said domain and the N-terminal end of a second said domain, wherein each said domain
 comprises a biologically active polypeptide and has a conformation suitable for biological activity
 independent of the biological activity of the other domain.

9. A biosynthetic linker comprising a polypeptide linking two non-naturally linked polypeptide domains to
 form a functional protein, said linker comprising plural, hydrophilic, peptide-bonded amino acids
 which define a polypeptide of a length sufficient to span the distance between the C-terminal end of a
 first said domain and the N-terminal end of a second said domain, wherein said domains together
 comprise an immunologically reactive binding site specific for a preselected antigen.

10. The biosynthetic linker of claim 9 wherein said two domains mimic a VH and VL from a natural 
immunoglobulin. 

11. The biosynthetic linker of claim 8 or 9 which
 (a) comprises threonine, or
 (b) is cysteine-free, or
 (c) comprises a plurality of glycine or serine residues, or
 (d) comprises plural consecutive copies of an amino acid sequence, or
 (e) spans a distance of at least 40 angstroms, or
 (f) comprises the amino acid sequence GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyGlySer, or
 (g) comprises one or a pair of amino acid sequences recognizable by a site specific cleavage agent.

12. The biosynthetic linker of claim 8 wherein at least one of said domains comprises an enzyme, a toxin, a
 receptor, a binding site, a biosynthetic antibody binding site, a growth factor, a cell-differentiation factor,
 a lymphokine, a cytokine, a hormone, a remotely detectable moiety or an anti-metabolite.

13. The biosynthetic linker of claim 8 wherein said first domain comprises a single chain binding site and
 said second domain comprises an enzyme, a toxin, a receptor, a binding site, a biosynthetic antibody
 binding site, a growth factor, a cell-differentiation factor, a lymphokine, a cytokine, a hormone, or an
 anti-metabolite.

14. The biosynthetic linker of claim 8 wherein at least one of said domains comprises a polypeptide 
capable of sequestering an ion. 

15. The biosynthetic linker of claim 14 wherein said polypeptide comprises calmodulin, metallothionein, a
 fragment thereof, or an amino acid sequence rich in at least one of glutamic acid, aspartic acid, lysine,
 and arginine.

16. A DNA encoding the biosynthetic linker of claim 8 or 9.

17. A host cell transformed with and capable of expressing the DNA of claim 16.

18. The biosynthetic linker of claim 8 or 9 wherein the amino acids of said linker together assume an
 unstructured polypeptide configuration in aqueous solution.

FIG. 4A [nucleotide sequence with encoded amino acid sequence and restriction sites]






EP 0 623 679 A1 



FIG. 4B [nucleotide sequence with encoded amino acid sequence and restriction sites]

[Figure: nucleotide sequence with encoded amino acid sequence and restriction sites]

FIG. 4D [nucleotide sequence with encoded amino acid sequence and restriction sites]

[Figure: nucleotide sequence with encoded amino acid sequence and restriction sites]

[Figure: nucleotide sequence with encoded amino acid sequence and restriction sites]

[Figure: nucleotide sequence with encoded amino acid sequence and restriction sites]

10 20 30 40 50 60 70

GAATTCATGGCTGACJUtCJUUTrCAACAAGCAA&^ 

EFMADNKFNKCQQNAFYEILHLPN L 
CcoRI MlttZ BglXI BspKZ* 

xml 

•3 »5 103 113 123 133 143 

AKCCAACAGCAGCGTAACGCCTTCATCCAAAGCTltUAAGACCACCOT 
HEEQRNGFIQS LKDDP6QSANLLAE 

HlndlXI BapMI* 

2C047ZZZ 

160 170 180 190 200 210 220 

GCCAAGAAACTGAACGACGCTCAGGCCCCCAAGAGTGATCCCGAAGTTCAACTCCAGCA^ 
AKKLNDAQAPXSDPEVQLQQSGPE L 
Karl PstZ 

235 245 255 265 275 265 295 

CTTAAACCTCGCCCCTCTCTCCCCATCTCCTGCAAATCCTCTGGCTACATTTrCACCCACTTCTACATCAATTCC 
VK P G ASVRM S C K S S G Y Z FTDFYMN W 
NarZ Fapl 

310 320 330 340 350 360 370 

GTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATCGGGTACATTTCCCCATACTCTGGGGTTACCGCCTACAAC 
VRQSHCKSLOYIGYXSPYSGVTGYN 
BatXZ Xbal PflMI BstEXX 

385 395 405 415 425 435 445 

CAG AAG TTTAAAGGT AAG G CG AC CCTT ACTGT CG A C AAATCTTCCTCAACTG CTT ACATGG A GCTG CGTTCTTTG 
QKFKGKATLTVDKSSS TAYMELRS L 
Oral Sail 

460 470 480 490 500 510 520 

ACCTCTCAGCACTCCGCCGTATACTATTGCGCGGGCTCCTCTGCTAACAAATGGGCCATCCATTATTGGGCTCAT 
TSEDSAVYYCAG5SGHKWAMDYWGH 
SacIX HeoZ 

535 545 555 565 575 585 595 

GGTGCTAGCGTTACTGTGACCTCTGGTGGCCCTGGCTCCGCCGCTGGT6GCTCCGGTGCCGGCCGATCCGACGTC 
GASVTVSSGGGGSGGGGSGGGG5DV 
Nhal SacZ BaaHZ AatZZ 

610 620 630 640 650 660 670 

GTTGTTACCCAGACTCCGCTGTCTCTCCCGGTTTCTCTGGGTGACCAGGCTrCTATTTCTTGCCGCTCTTCCCAG 
VVTQTPLSLPVSLGDQASISCRSSQ 

BatEIX PflM 

«85 695 705 715 725 735 745 

TCTCTGGTCCATTCTAATGGTAACACTTACCTGAACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCTG 

SLVHSHGNTYLHHYLQKAGQSPKLL 
I B»tXX BspMX+ HlndZZZ 

XpnZ 



FIG. 6A-1 

FIG. 7A [plot; horizontal axis: µg of digoxin binding protein per ml, .00001 to 100]




845 855 865 875 885 895 

SCCGAAGACCTCCCTATCTA C T T CTCC TCT CAGACTACTCATCTXCCGCCGACT 
L K I S RVZAEDLC ZYFCS QTTBVPPT 
Bglll 

910 920 930 940 

TTTGCTGGTGGCACCAACCTCGACATTAAACCTTAACTGCAG 
rCGCTKLEXKR* 

Xhol HpaZ PstI 

FIG. 6A-2

10 20 30 40 50 60

GATCCTGACGTCGTAATGACCCAGACTCCGCTGTCTCTGCCGGTTTCTCTCGGTGACCAG
DPDVVMTQTPLSLPVSLGDQ

70 80 90 100 110 120

GCTTCTATTTCTTGCCGCTCTTCCCAGTCTCTGGTCCATTCTAATGGTAACACTTACCTG
ASISCRSSQSLVHSNGNTYL

PflMI BstXI

130 140 150 160 170 180

AACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCTGATCTACAAAGTCTCTAAC
NWYLQKAGQSPKLLIYKVSN
BspMI+ HindIII

KpnI

190 200 210 220 230 240

CGCTTCTCTGGTGTCCCGGATCGTTTCTCTGGTTCTGGTTCTGGTACTGACTTCACCCTG
RFSGVPDRFSGSGSGTDFTL

250 260 270 280 290 300

AAGATCTCTCGTGTCGAGGCCGAAGACCTGGGTATCTACTTCTGCTCTCAGACTACTCAT
KISRVEAEDLGIYFCSQTTH
BglII

310 320 330 340 350 360

GTACCGCCGACTTTTGGTGGTGGCACCAAGCTCGAGATTAAACGTGGATCTGGAGGTGGC
VPPTFGGGTKLEIKRGSGGG

XhoI

370 380 390 400 410 420

GGATCTGGTGGAGGTGGCTCTGGTGGCGGTGGATCCGAAGTTCAATTGCAGCAGTCTGGT
GSGGGGSGGGGSEVQLQQSG

BamHI

430 440 450 460 470 480

CCTGAATTGGTTAAACCTGGCGCCTCTGTGCGCATGTCCTGCAAATCCTCTGGGTACATT

PELVKPGASVRMSCKSSGYI
NarI FspI

490 500 510 520 530 540

TTCACCGACTTCTACATGAATTGGGTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATC
FTDFYMNWVRQSHGKSLDYI

BstXI XbaI

550 560 570 580 590 600

GGGTACATTTCCCCATACTCTGGGGTTACCGGCTACAACCAGAAGTTTAAAGGTAAGGCG
GYISPYSGVTGYNQKFKGKA
PflMI BstEII DraI

610 620 630 640 650 660

ACCCTTACTGTCGACAAATCTTCCTCAACTGCTTACATGGAGCTGCGTTCTTTGACCTCT

TLTVDKSSSTAYMELRSLTS
SalI

670 680 690 700 710 720

GAGGACTCCGCGGTATACTATTGCGCGGGCTCCTCTGGTAACAAATGGGCCATGGATTAT
EDSAVYYCAGSSGNKWAMDY
SacII NcoI

730 740 750 760

TGGGGTCATGGTGCTAGCGTTACTGTGAGCTCTTAACTGCAG
WGHGASVTVSS*
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no.. 
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XO 20 30 40 50 60 

CAACTTCAACTCCAGCACTCTCCTCCTCCATTCCTTCCACCTTCCCXCACTCTCTCCCTC 
Z V Q LE Q5G PC LVRPSQTLS L 

70 80 90 100 110 120

ACCTGCACATCCTCTGGCTACATTTTCACCGACTTCTACATGAATTGGGTTCGCCAGCCT

TCTSSGYIFTDFYMNWVRQP
BspMI+ BstXI

130 140 150 160 170 180

CCTGGTCGGGGTCTAGACTACATCGGGTACATTTCCCCATACTCTGGGGTTACCGGCTAC
PGRGLDYIGYISPYSGVTGY
XbaI PflMI BstEII

190 200 210 220 230 240

AACCAGAAGTTTAAAGGTAAGGCGACCCTTCTGGTCAACAAATCTAAGAACCAGGCTTCC
NQKFKGKATLLVNKSKNQAS

DraI

250 260 270 280 290 300

CTGCGGCTGTCTTCTGTGACCGCTGCGGACACCGCGGTATACTATTGCGCGGGCTCCTCT
LRLSSVTAADTAVYYCAGSS

SacII

310 320 330 340 350 360

GGTAACAAATGGGCCATGGATTATTGGGGTCAGGGTTCTCTGGTTACTGTGAGCTCTGGT
GNKWAMDYWGQGSLVTVSSG
NcoI SacI

370 380 390 400 410 420

GGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCCGACGTCGTTATGACCCAG
GGGSGGGGSGGGGSDVVMTQ

BamHI AatII

430 440 450 460 470 480

CCTCCCTCGGTTTCCGGGGCTCCTGGTCAGCGGGTTACTATTTCTTGCCGCTCTTCCCAG
PPSVSGAPGQRVTISCRSSQ

PflM

490 500 510 520 530 540

TCTCTGGTCCATTCTAATGGTAACACTTACCTGAACTGGTACCAGCAACTGCCTGGTACG

SLVHSNGNTYLNWYQQLPGT
I BstXI KpnI

550 560 570 580 590 600

GCTCCGAAGCTTCTGATCTACAAAGTCTCTAACCGCTTCTCTGGTGTCCCCGATCGTTTC
APKLLIYKVSNRFSGVPDRF
HindIII

610 620 630 640 650 660

TCTGGTTCTGGTTCTGGTACTGACTTCACCCTGGCGATCACTGGTCTCCAGGCCGAAGAC
SGSGSGTDFTLAITGLQAED

670 680 690 700 710 720

GAGGCTGACTACTTCTGCTCTCAGACTACTCATGTACCGCCGACTTTTGGTGGTGGCACC
EADYFCSQTTHVPPTFGGGT

730 740 750

AAGCTCACGGTTCTGCGTTAACTGCAG

KLTVLR* LQ
HpaI PstI
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10 20 30 40 50 to 

GAATTCGAAGTTCAACTGCAGCAGTCTGGTCCTGAATTGGTTAAACCTGGCGCCTCTGTG
EFEVQLQQSGPELVKPGASV

AsuII PstI NarI Fs

EcoRI

70 80 90 100 110 120

CGCATGTCCTGCAAATCCTCTGGCTACACCTTCACCAACTATTACATCCACTGGCTTAAG

RMSCKSSGYTFTNYYIHWLK
pI AflII

130 140 150 160 170 180

CAGTCTCATGGTAAGTCTCTAGAGTGGATCGGTTGGATTTACCCCGGTAATGGTAACACT
QSHGKSLEWIGWIYPGNGNT
XbaI ful

190 200 210 220 230 240

AAGTACAATGAGAACTTTAAAGGTAAGGCGACCCTTACTGTCGACAAATCTTCCTCAACT
KYNENFKGKATLTVDKSSST
DraI SalI

250 260 270 280 290 300

GCTTACATGGAGCTGCGTTCTTTGACCTCTGAGGACTCCGCGGTATACTATTGCGCGCGT
AYMELRSLTSEDSAVYYCAR

SacII BssHII

310 320 330 340 350 360

TACACTCATTATTACTTCGATTATTGGGGCCATGGCGCTAGCGTTACCGTGAGCTCTGGT
YTHYYFDYWGHGASVTVSSG

NcoI NheI SacI

370 380 390 400 410 420

GGCGGTGGCTCCGGCGGTGGTGGGTCGGGTGGCGGCGGATCCGACGTCGTTATGACCCAG
GGGSGGGGSGGGGSDVVMTQ

BamHI AatII

430 440 450 460 470 480

ACTCCCCTGTCTCTCCCGGTTTCTCTGGGTGACCAGGCTTCTATTTCTTGCCGCTCTTCC
TPLSLPVSLGDQASISCRSS

BstEII

490 500 510 520 530 540

CAGTCTATCGTCCATTCTAATGGTAACACTTACCTGGAGTGGTACCTGCAAAAGGCTGGT
QSIVHSNGNTYLEWYLQKAG
BstXI BspMI+

KpnI

550 560 570 580 590 600

CAGTCTCCCAAGCTTCTCATCTACAAAGTCTCTAACCGCTTCTCTGGTGTCCCCGATCGT
QSPKLLIYKVSNRFSGVPDR
HindIII

610 620 630 640 650 660

TTCTCTGGTTCTGGTTCTGGTACTGACTTCACCCTGAAGATCTCTCGTGTCGAGGCCGAG
FSGSGSGTDFTLKISRVEAE

BglII

670 680 690 700 710 720

GATCTGGGTATCTACTACTGCTTCCAAGGGTCTCATGTACCGTGGACTTTCGGCGGTGGG
DLGIYYCFQGSHVPWTFGGG

730 740 750

ACCAAGCTCGAGATTAAACGTTAACTGCAG

XhoI HpaI PstI
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10 20 30 40 SO CO 

GATCCCGAGGTTATGCTCGTTGAATCTGGTGGAGTACTCATGGAACCTGGTGGCTCCCTG
DPEVMLVESGGVLMEPGGSL

ScaI Ecoo

70 80 90 100 110 120

AAGCTCAGCTGTGCTGCTAGCGGCTTCACCTTCTCTCGTTACGCCATGTCTTGGGTCCGT
KLSCAASGFTFSRYAMSWVR
EspI NheI PflMI

130 140 150 160 170 180

CAGACTCCGGAGAAGCGTCTAGAGTGGGTCGCGACGATATCTTCTGGTGGTTCTCACACG
QTPEKRLEWVATISSGGSHT
BspMII XbaI NruI EcoRV

190 200 210 220 230 240

TTCCATCCAGACAGTGTCAAGGGTCGATTCACCATCTCTCGAGACAACGCTAAGAACACC
FHPDSVKGRFTISRDNAKNT

XhoI

250 260 270 280 290 300

TTGTACCTGCAAATGTCTTCTCTACGTAGTGAAGATACTGCTATGTACTACTGTGCACGT
LYLQMSSLRSEDTAMYYCAR
BspMI+ SnaBI ApaLI

310 320 330 340 350 360

CCTCCACTCATCTCACTAGTTGCTGATTATGCCATGGATTATTGGGGTCATGGTGCTAGC
PPLISLVADYAMDYWGHGAS
SpeI NcoI NheI

370 380 390 400 410 420

GTTACTGTGAGCTCTGGTGGCGGTGGCTCCGGCGGTGGTGGCTCCGGTGGCGGCGGATCG
VTVSSGGGGSGGGGSGGGGS
SacI

430 440 450 460 470 480

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCT

DIVMTQSHKFMSTSVGDRVS
EcoRV BstEII

490 500 510 520 530 540

ATCACTTGTAAGGCCAGCCAGGATGTGGGTGCTGCTATCGCATGGTATCAGCAGAAGCCC
ITCKASQDVGAAIAWYQQKP
PflMI Sma

550 560 570 580 590 600

GGGCAGTCTCCTAAGCTCCTGATCTACTGGGCGTCGACTCGTCATACTGGTGTCCCGGAT

GQSPKLLIYWASTRHTGVPD
I SalI

610 620 630 640 650 660

CGTTTCACTGGCTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACGTTCAGTCT
RFTGSGSGTDFTLTISNVQS
BspMII AsuII

670 680 690 700 710 720

GATGACCTCGCTGATTACTTCTGCCAGCAATATTCCGGGTACCCTCTGACTTTCGGTGCC
DDLADYFCQQYSGYPLTFGA

SspI KpnI Nae

730 740 750

GGCACTAAACTCGAGCTGAAGTAACTGCAG

GTKLELK*
I XhoI PstI



RGi. °ib 
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10 20 30 40 50 60 

GATCCCGAGGTTATGCTCGTTGAATCTGGTGGAGTACTCATGGAACCTGGTGGCTCCCTC
DPEVMLVESGGVLMEPGGSL

ScaI Ecoo

70 80 90 100 110 120

KLSCAASGFTFSRYAMSWVR
EspI NheI PflMI

130 140 150 160 170 180

CAGACTCCGGAGAAGCGTCTAGAGTGGGTCGCGACGATATCTTCTGGTGGTTCGAACACT
QTPEKRLEWVATISSGGSNT
BspMII XbaI NruI EcoRV AsuII

190 200 210 220 230 240

TACTATCCAGACAGTGTGAAGGGTCGATTCACGATCTCTCGAGACAACGCTAAGAACACG
YYPDSVKGRFTISRDNAKNT

XhoI

250 260 270 280 290 300

TTGTACCTGCAAATGTCTTCTCTACGTAGTGAAGATACTGCTATGTACTACTGTGCACGT
LYLQMSSLRSEDTAMYYCAR
BspMI+ SnaBI ApaLI

310 320 330 340 350 360

CCTCCACTGATCTCACTAGTTGCTGATTATGCCATGGATTATTGGGGTCATGGTGCTAGC
PPLISLVADYAMDYWGHGAS
SpeI NcoI NheI

370 380 390 400 410 420

GTTACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCG
VTVSSGGGGSGGGGSGGGGS

SacI

430 440 450 460 470 480

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCT

DIVMTQSHKFMSTSVGDRVS
EcoRV BstEII

490 500 510 520 530 540

ATCACTTGTAAGGCCAGCCAGGATGTGGGTGCTGCTATCGCATGGTATCAGCAGAAGCCC
ITCKASQDVGAAIAWYQQKP
PflMI Sma

550 560 570 580 590 600

GGGCAGTCTCCTAAGCTGCTGATCTACTGGGCGTCGACTCGTCATACTGGTGTCCCGGAT

GQSPKLLIYWASTRHTGVPD
I SalI

610 620 630 640 650 660

CGTTTCACTGGCTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACGTTCAGTCT
RFTGSGSGTDFTLTISNVQS
BspMII AsuII

670 680 690 700 710 720

GATGACCTGGCTGATTACTTCTGCCAGCAATATTCCGGGTACCCTCTCACTTTCGGTGCC
DDLADYFCQQYSGYPLTFGA

SspI KpnI Nae

730 740 750

GGCACTAAACTCGAGCTGAAGTAACTGCAG

GTKLELK*
I XhoI PstI
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10 20 
Met Lys Ala Ile Phe Val Leu Lys Gly Ser Leu Asp Arg Asp Leu Asp Ser Arg Leu Asp
ATG AAA GCA ATT TTC GTA CTC AAA GGT TCA CTC GAC AGA GAT CTC GAC TCT CGT CTC GAT

BglII

30 40
Leu Asp Val Arg Thr Asp His Lys Asp Leu Ser Asp His Leu Val Leu Val Asp Leu Ala
CTC GAC GTT CGT ACC GAC CAC AAA GAC CTC TCT GAT CAC CTC GTT CTC GTC GAC CTC GCT

BclI SalI
50 60
Arg Asn Asp Leu Ala Arg Ile Val Thr Pro Gly Ser Arg Tyr Val Ala Asp Leu Glu Phe
CGT AAC GAC CTC GCT CGT ATC GTT ACT CCC GGG TCT CGT TAC GTT GCG GAT CTC GAA TTC

SmaI EcoRI

CAT r 1 ' 



EcoRI 




FlGi. 10 6 



Afim 
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94- - 

[Gel figure: molecular weight markers at 94, 43, 29, 20.1, and 14.4; lanes 0 to 5]



DVQLQESGPGLVKPSQSLSLTCSVTGYSIT
SGYFWNWIRQFPGNKLEWLGFIKYDGSNYG
NPSLKNRVSITRDTSENQFFLKLDSVTTAT
YYCACDHDHLYFDYWGQGTTLTVS

GGGGSGGGGSGGGGS

QAVVTQESALTTSPGGTVILTCRSSTGAVT
TSNYANWIQEKPDHLFTGLIGGTSNRAPGV
PVRFSGSLIGDKAALTITGAQTEDDAMYFC
ALWFRNHFVFGGGTKVTVLG



FIG. 9C 
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26-10sFv 

26-10 Fab

[Plot; x-axis: INHIBITOR CONCENTRATION (M), log scale]
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LOG UNBOUND DIGOXIN CONCENTRATION (MJ 
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10 20 30 40 50 60 

GAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG

EFMADNKFNKEQQNAFYEIL
EcoRI MluI BglII

70 80 90 100 110 120

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG
HLPNLNEEQRNGFIQSLKDE
BspMI+ HindIII

130 140 150 160 170 180

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG
PSQSANLLADAKKLNDAQAP

NheI FspI

190 200 210 220 230 240

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG
KSDQGQFMADNKFNKEQQNA

MluI
XmnI

250 260 270 280 290 300

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA
FYEILHLPNLNEEQRNGFIQ
BglII BspMI+ H

310 320 330 340 350 360

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC

SLKDEPSQSANLLADAKKLN
indIII NheI

370 380

GATGCGCAGGCACCCAAATCGGATCC

DAQAPKSDP
FspI BamHI






EP 0 623 679 A1 



(BABS) -

10 20 30 40 50 60 70

GGATCCCGTAACTCTGACTCTGAATGCCCGCTGAGCCACGACGCGTACTC^ 

GSGNSDSECPLSHDGYCLRDGVCHY
BamHI BsmI+ EspI

85 95 105 115 125 135 145

ATCGAAGCTCTGGACAAATACGCATGCAACTGCGTTGTAGGCTACATCGGTGAGCGCTGCCAGTATCGCGATCTG
IEALDKYACNCVVGYIGERCQYRDL

SphI NruI

160 170

AAATGGTGGGAGCTGCGTTAACTGCAG

KWWELR*

HpaI PstI

FIG. 15A
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGGCGACCCGTCCAAGGACTCCAAAGCTCAGGTTTCTGCTGCCGAAGCTGGT
GSGGDPSKDSKAQVSAAEAG

BamHI

70 80 90 100 110 120

ATCACTGGCACCTGGTATAACCAACTGGGGTCGACTTTCATTGTGACCGCTGGTGCGGAC
ITGTWYNQLGSTFIVTAGAD

SalI

130 140 150 160 170 180

GGAGCTCTGACTGGCACCTACGAATCTGCCGTTGGTAACGCAGAATCCCGCTACGTACTG
GALTGTYESAVGNAESRYVL
SacI SnaBI

190 200 210 220 230 240

ACTGGCCGTTATGACTCTGCACCTGCCACCGATGGCTCTGGTACCGCTCTCGGCTGGACT
TGRYDSAPATDGSGTALGWT

BspMI+ KpnI

250 260 270 280 290 300

GTGGCTTGGAAAAACAACTATCGTAATGCGCACAGCGCCACTACGTGGTCTGGCCAATAC
VAWKNNYRNAHSATTWSGQY

FspI DraIII BalI

PflMI BstXI

310 320 330 340 350 360

GTTGGCGGTGCTGAGGCTCGTATCAACACTCAGTGGCTGTTAACATCCGGCACTACCGAA
VGGAEARINTQWLLTSGTTE

DraIII HpaI

370 380 390 400 410 420

GCGAATGCATGGAAATCGACACTAGTAGGTCATGACACCTTTACCAAAGTTAAGCCTTCT
ANAWKSTLVGHDTFTKVKPS
BsmI+ SpeI
NsiI

430 440 450 460 470 480

GCTGCTAGCATTGATGCTGCCAAGAAAGCAGGCGTAAACAACGGTAACCCTCTAGACGCT
AASIDAAKKAGVNNGNPLDA
NheI BstEII XbaI

490 500

GTTCAGCAATAACTGCAG

VQQ*
PstI

FIG. 15B
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(BAfiS) - 

10 20 30 40 50 60 

GGATCCGGTGTACGTAGCTCCTCTCGCACTCCGTCCGATAAGCCCGTTGCTCATGTAGTT

GSGVRSSSRTPSDKPVAHVV
BamHI SnaBI

70 80 90 100 110 120

GCTAACCCTCAGGCAGAAGGTCAGCTTCAGTGGCTGAACCGTCGCGCTAACGCCCTGCTG
ANPQAEGQLQWLNRRANALL
MstII BglI

130 140 150 160 170 180

GCAAACGGCGTTGAGCTCCGTGATAACCAGCTCGTGGTACCTTCTGAAGGTCTGTACCTG
ANGVELRDNQLVVPSEGLYL
SacI PflMI KpnI

190 200 210 220 230 240

ATCTATTCTCAAGTACTGTTCAAGGGTCAGGGCTGCCCGTCGACTCATGTTCTGCTGACT
IYSQVLFKGQGCPSTHVLLT
ScaI SalI

250 260 270 280 290 300

CACACCATCAGCCGTATTGCTGTATCTTACCAGACCAAAGTTAACCTGCTGAGCGCTATC
HTISRIAVSYQTKVNLLSAI

HpaI BspMI+ Eco47III
EspI

310 320 330 340 350 360

AAGTCTCCGTGCCAGCGTGAAACTCCCGAGGGTGCAGAAGCGAAACCATGGTATGAACCG
KSPCQRETPEGAEAKPWYEP

NcoI

370 380 390 400 410 420

ATCTACCTGGGTGGCGTATTTCAACTGGAGAAAGGTGACCGTCTGTCCGCAGAAATCAAC
IYLGGVFQLEKGDRLSAEIN

BstEII

430 440 450 460 470 480

CGTCCTGACTATCTAGATTTCGCTGAATCTGGCCAGGTGTACTTCGGTATTATCGCACTG
RPDYLDFAESGQVYFGIIAL
XbaI BalI

FIG. 15C



490 

TAACTGCAG 

PstI 
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(BAB5) - 

10 20 30 40 50 60 

GGATCCGGTGCTGATCAGCTGACTGACGAGCAGATCGCTGAATTTAAAGAGGCTTTCTCT

GSGADQLTDEQIAEFKEAFS
BamHI BclI PvuII DraI

70 80 90 100 110 120

CTGTTTGACAAAGACGGTGACGGTACCATCACTACCAAAGAGCTCGGCACCGTTATGCGC
LFDKDGDGTITTKELGTVMR

KpnI SacI FspI

130 140 150 160 170 180

AGCCTTGGCCAGAACCCGACTGAAGCTGAATTGCAGGACATGATCAACGAAGTCGACGCT
SLGQNPTEAELQDMINEVDA
BalI BclI SalI

190 200 210 220 230 240

GACGGTAACGGCACCATCGATTTTCCGGAATTTCTGAACCTGATGGCGCGCAAGATGAAA
DGNGTIDFPEFLNLMARKMK
ClaI BspMII BssHII

250 260 270 280 290 300

GACACTGACTCTGAAGAGGAACTGAAAGAGGCCTTCCGTGTTTTCGACAAAGACGGTAAC
DTDSEEELKEAFRVFDKDGN

StuI

310 320 330 340 350 360

GGTTTCATCTCGGCCGCTGAACTGCGTCACGTTATGACTAACCTGGGTGAAAAGCTTACT
GFISAAELRHVMTNLGEKLT
EagI HindIII

370 380 390 400 410 420

GACGAAGAAGTTGACGAAATGATTCGCGAAGCTGACGTCGATGGTGACGGCCAGGTTAAC
DEEVDEMIREADVDGDGQVN
XmnI NruI AatII HpaI

430 440 450

TACGAAGAGTTCGTTCAGGTTATGATGGCTAAGTAACTGCAG
YEEFVQVMMAK*

PstI

FIG. 15D






(BABS) - 

10 20 30 40 50 60 

CGATCCCCTGCAGCCTCTCTCCGCTCTCTCACTATTCCCCAACCCCCXATCATTCCTGAA 

GSGGGSLGSLTIAEPAMIAE
BamHI BglI Bsm

70 80 90 100 110 120 

TGCAAGACTCGTACCGAAGTCTTCGAGATCTCTCGTCGTCTGATCGATCGCACTAATGCC 

CKTRTEVFEISRRLIDR TNA 
I+ BglII ClaI Bs

PvuI

130 140 150 160 170 180

AACTTCCTGGTATGGCCGCCGTGCGTCGAGGTACAACGCTGCTCCGGGTGTTGCAACAAT

NFLVWPPCVEVQRCSGCCNN
tXI

190 200 210 220 230 240 

CGTAACGTTCAATGTCGACCGACTCAAGTCCAGCTGCGTCCGGTCCAAGTCCGCAAAATC 
RNVQCRPTQVQLRPVQVRKI 
SalI PvuII

250 260 270 280 290 300

GAGATTGTACGTAAGAAACCGATCTTTAAGAAGGCCACTGTTACTCTGGAAGACCATCTG
EIVRKKPIFKKATVTLEDHL
SnaBI 

310 320 330 340 350 

GCATGCAAATGTGAGACTGTAGCGGCCGCACGTCCAGTTACTTAACTGCAG

ACKCETVAAARPVT*
SphI EagI PstI

NotI

FIG. 15E






(BABS) - 

10 20 30 40 50 60 

GGATCCGGTATATTCCCCAAACAATACCCAATTATAAACTTTACCACAGCCGGTGCCACT

GSGIFPKQYPIINFTTAGAT
BamHI

70 80 90 100 110 120

GTCCAAAGCTACACAAACTTTATCAGAGCTGTTCGCGGTCGTTTAACAACTGGAGCTGAT
VQSYTNFIRAVRGRLTTGAD

130 140 150 160 170 180

GTGAGACATGAAATACCAGTGTTGCCAAACAGAGTTGGTTTGCCTATAAACCAACGGTTT
VRHEIPVLPNRVGLPINQRF

190 200 210 220 230 240

ATTTTAGTTGAACTCTCAAATCATGCAGAGCTTTCTGTTACATTAGCGCTGGATGTCACC
ILVELSNHAELSVTLALDVT

Eco47III

250 260 270 280 290 300

AATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCATATTTCTTTCATCCTGACAAT

NAYVVGYRAGNSAYFFHPDN
NdeI

NsiI

310 320 330 340 350 360

CAGGAAGATGCAGAAGCAATCACTCATCTTTTCACTGATGTTCAAAATCGATATACATTC
QEDAEAITHLFTDVQNRYTF

ClaI

370 380 390 400 410 420

GCCTTTGGTGGTAATTATGATAGACTTGAACAACTTGCTGGTAATCTGAGAGAAAATATC
AFGGNYDRLEQLAGNLRENI

430 440 450 460 470 480

GAGTTGGGAAATGGTCCACTAGAGGAGGCTATCTCAGCGCTTTATTATTACAGTACTGGT
ELGNGPLEEAISALYYYSTG

Eco47III ScaI

490 500 510 520 530 540

GGCACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAA
GTQLPTLARSFIICIQMISE

550 560 570 580 590 600

GCAGCAAGATTCCAATATATTGAGGGAGAAATGCGCACCAGAATTAGGTACAACCGGAGA
AARFQYIEGEMRTRIRYNRR

FspI Bgl






(BABS) - 

10 20 30 40 50 60 

GGATCCGGTGCTCCGACTTCTAGCTCTACTAAGAAAACTCAGCTTCAGCTGGAACACCTC

GSGAPTSSSTKKTQLQLEHL
BamHI PvuII

70 80 90 100 110 120 

CTGCTGGACCTTCAGATGATCCTGAACGGTATCAACAACTACAAGAACCCGAAACTGACT 
LLDLQMILNGINNYKNPKLT 

130 140 150 160 170 180 

CGTATGCTGACTTTCAAATTCTACATGCCGAAGAAAGCTACCGAACTGAAACACCTTCAG 
RMLTFKFYMPKKATELKHLQ 

190 200 210 220 230 240 

TGCCTGGAAGAAGAACTGAAGCCGCTGGAGGAAGTACTGAACCTGGCTCAGTCTAAAAAC 
CLEEELKPLEEVLNLAQSKN

ScaI

250 260 270 280 290 300

TTCCACCTGCGTCCGCGTGACCTGATCAGCAACATCAACGTAATCGTTCTAGAACTTAAA
FHLRPRDLISNINVIVLELK

BclI XbaI

310 320 330 340 350 360

GGCTCTGAAACTACCTTCATGTGCGAATACGCTGACGAAACTGCTACCATCGTAGAATTT
GSETTFMCEYADETATIVEF

370 380 390 400 410 420 

CTGAACCGTTGGATCACCTTCTGCCAGTCTATCATCTCTACTCTGACTTAACTGCAG 
LNRWITFCQSIISTLT*

PstI 
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(BABS) - 

10 20 30 40 50 60 

GGATCCGGTGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG

GSGADNKFNKEQQNAFYEIL
BamHI MluI BglII

XmnI

70 80 90 100 110 120

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG
HLPNLNEEQRNGFIQSLKDE
BspMI+ HindIII

130 140 150 160 170 180

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG
PSQSANLLADAKKLNDAQAP

NheI FspI

190 200 210 220 230 240

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG
KSDQGQFMADNKFNKEQQNA

MluI
XmnI

250 260 270 280 290 300

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA
FYEILHLPNLNEEQRNGFIQ
BglII BspMI+ H

310 320 330 340 350 360

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC

SLKDEPSQSANLLADAKKLN
indIII NheI

370 380

GATGCGCAGGCACCGAAATAACTGCAG

DAQAPK*
FspI PstI
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